PLOS One. 2025 Nov 20;20(11):e0333822. doi: 10.1371/journal.pone.0333822

Cross-modal fusion of brain imaging and clinical data for Parkinson’s disease progression prediction

Jinyu Wen 1, Amei Chen 2, Jingxin Liu 3, Hua Xiong 4, Meie Fang 1,*, Xinhua Wei 2
Editor: Nima Broomand Lomer 5
PMCID: PMC12633903  PMID: 41264649

Abstract

Background: Machine learning shows great potential in science but struggles with complex, high-dimensional multi-omics data. The progression of Parkinson’s disease (PD) is prolonged, and the disease is diagnosed mainly from clinical signs. This paper proposes a novel decision fusion method that improves the precision of PD progression classification by combining imaging with clinical data.

Methods: A Cross-Modal Fusion Prediction Model (CMFP) is proposed, with key steps that involve data preparation, modelling, and prediction. The data encompasses three modalities: clinical, DTI (diffusion tensor imaging), and DAT (dopamine transporter), with Lasso used for the selection of features. Individual modalities are classified using AdaBoost and the results are integrated into the new fusion strategy, CMF, to obtain a novel model. Finally, this model is used for predictions.

Results: The predictive performance of CMFP on the progression of PD achieved an AUC of 77.91%. This represents improvements of 24.48%, 30.78%, and 32.7% in AUC compared to predictions solely with clinical data, DTI data and DAT data, respectively. The combined prediction of clinical and DTI data demonstrated statistical significance compared to predictions based solely on clinical data, with a p-value of 9.183e-4. Additionally, this method identified crucial brain regions and important clinical metrics associated with PD. Notably, using the DTI metric along the perivascular space (DTI-ALPS) to predict and evaluate the progression of PD offers an advantage over DTI-clinical fusion prediction, with ACC increasing by 3.85%.

Conclusion: The results indicate that CMFP is effective, contributing to overcoming the limitations of low predictive performance in single-modal data and enhancing the accuracy of the PD progression predictions.

Introduction

Parkinson’s Disease (PD) is a prevalent progressive neurodegenerative disorder characterised by tremors at rest, bradykinesia, rigidity, and postural instability, predominantly affecting the human nervous system and motor control [1]. The prolonged progression of the disease significantly affects quality of life, leading to physical disabilities and nonmotor symptoms, and is associated with increased mortality rates. In such scenarios, early prediction of the disease is crucial for implementing appropriate intervention measures [2]. However, there is currently a lack of reliable clinical outcomes and/or biomarkers for progression, with clinical evaluations potentially being time-consuming and subject to variations based on patient conditions [3]. Although conventional MRI excels at providing tissue contrast, its value for suspected patients with PD mainly lies in excluding concurrent brain diseases rather than confirming PD diagnoses [4]. Given the heterogeneity of PD, which can be categorised into various subtypes based on the age of onset, clinical manifestations, and progression speed [5], predicting the progression of PD becomes especially imperative.

Several studies have indicated that changes in white matter microstructure occur before the loss of cortical neurones, even in the absence of apparent grey matter atrophy. DTI technology has been widely used to assess microstructural damage in white matter in patients with PD [6–8]. For example, studies have suggested that the measurement of the free water content in the substantia nigra using DTI can predict changes in motor slowness and cognitive status in PD patients in the next year [9]. A recent DTI study revealed significant whole-brain voxel-wise changes in fractional anisotropy (FA) and mean diffusivity (MD) of the corpus callosum, especially in its genu and body. Moreover, the decrease in corpus callosum FA was closely related to the decline in FA and MD in widespread cortical and subcortical regions [10]. Multivariate regression analysis further confirmed a negative correlation between corpus callosum FA values and the severity of motor stiffness in PD patients, with the strongest impact observed in the anterior part of the corpus callosum [10]. These findings highlight the potential of corpus callosum microstructural changes as biomarkers for motor stiffness symptoms and disease progression, even in the early stages of PD.

Machine learning (ML) stands as a promising pivotal technology in the prediction domain [11]. ML, especially when combined with data mining techniques, is devoted to advancing algorithms that learn patterns from known data to form models, which are then applied to unknown data to forecast outcomes [12,13]. Consequently, ML has been extensively applied to predict PD and its progression, with the aim of improving its performance [5,14–17]. Currently, statistical models that predict the clinical progression of PD present challenges. Previous univariate longitudinal or multivariate analyses from cross-sectional studies have limitations in predicting individual outcomes or specific time points [18,19]. Existing research has shown that the construction of multimodal and hybrid models can facilitate the exploration of progression of PD [20]. In this paper, we used the ML technique and proposed a cross-modal fusion decision-making approach to address the limitations of low predictive performance in single-modal data.

In investigating the mechanisms of progression of PD, Yang and his team focused on changes in brain structural connectivity, revealing significant structural pattern transitions in eight core regions of the cerebral cortex and subcortex as PD progressed [21]. Simultaneously, Jain and his team successfully built an efficient model to predict the Unified Parkinson’s Disease Rating Scale (UPDRS) using collected noninvasive language test data, helping to improve remote monitoring of the progression of PD [22]. Although DTI data in the entire brain have significant potential to describe early microstructural changes in PD, research that integrates DTI with clinical characteristics to predict PD progression is relatively limited. Our work makes the following contributions:

  • (1) A cross-modal fusion prediction method (CMFP) is proposed to predict the progression of Parkinson’s Disease. The AUC values of fusion prediction using CMFP are improved by 24.48% and 30.78% compared to using only clinical or DTI data to predict the progression of PD, respectively.

  • (2) This paper reveals key brain regions and important clinical characteristics that are closely associated with the pathological progression of Parkinson’s Disease. The results of the prediction of fusion using selected key clinical and DTI data show an improvement of 13.82% in AUC compared to the use of all combinations of characteristics for the prediction.

  • (3) Using the DTI metric along the perivascular space (DTI-ALPS) to predict and evaluate the progression of PD offers an advantage over DTI-clinical fusion prediction, with ACC increasing by 3.85%.

Methods

Data acquisition

This research relied on the Parkinson’s Progression Markers Initiative (PPMI) database, which contains longitudinal data from patients with PD in multiple centres. All PPMI sites have obtained approval from their respective ethics committees, and all PPMI participants provided their written informed consent prior to participation. The severity of PD was assessed using the Hoehn and Yahr Scale (HYS), typically based on factors such as decline in motor skills, reduced quality of life, and dopaminergic losses, categorising PD into five stages (stages 1-5). Inclusion criteria comprised: 1) patients with PD with complete HYS scores followed longitudinally for at least 5 years; 2) baseline data that included comprehensive MRI images (3D T1WI and DTI images); 3) MRI scans performed on a 3T Siemens MRI machine. Exclusion criteria included: 1) incomplete clinical data; 2) poor image quality or errors in image processing.

A total of 123 patients with PD were included in the study. By assessing the longitudinal changes in the HYS scores of these patients over 5 years in the PPMI database, it was found that 74 patients had scores higher than baseline, 46 had scores identical to baseline and 3 had scores lower than baseline. In this research, patients with HYS scores higher than baseline were classified into the progression group (n = 74), while those with scores the same as or lower than baseline were classified into the stable group (n = 49). The research also encompassed extensive clinical data, including gender, age, years of education, PD motor assessment scales, and cerebrospinal fluid markers: Aβ42, a-syn, t-tau, and p-tau. The 123 participants underwent non-contrast-enhanced 3D volumetric T1-weighted MRI and DTI scans using a 3T Siemens MRI scanner. For DTI image pre-processing, the PANDA software was employed, which primarily involved several steps: 1. Removal of non-brain tissues; 2. Correction for eddy current effects and minor head motion; 3. Computation of the diffusion tensor matrix based on voxels; 4. Fibre allocation using the FACT algorithm to produce deterministic fibre tractography. If the angle of curvature exceeded 45 ° or the FA of the voxel was less than 0.2, the trajectory was terminated. In the ML algorithm modelling, we emphasised the selection of baseline clinical metrics, DAT markers, and DTI brain network metrics related to the progression of PD as features. Data screening was performed using the Lasso method to improve model accuracy and generalisability.
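The Lasso screening mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard Lasso objective (1/2n)·||y − Xw||² + λ·||w||₁ solved by cyclic coordinate descent, and the function names, toy data, and λ value are hypothetical. Features whose coefficient is shrunk exactly to zero are discarded.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the Lasso proximal step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the residual excluding j
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


def select_features(X, y, lam):
    """Return the indices of features with a non-zero Lasso coefficient."""
    w = lasso_coordinate_descent(X, y, lam)
    return [j for j, coef in enumerate(w) if coef != 0.0]


# Hypothetical toy data: the target depends only on feature 0.
X_train = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y_train = [2.0, -2.0, 2.0, -2.0]
weights = lasso_coordinate_descent(X_train, y_train, lam=0.5)
selected = select_features(X_train, y_train, lam=0.5)
```

In this toy case the irrelevant second feature receives a zero coefficient and is dropped, which is the behaviour feature screening relies on.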

Cross-Modal Fusion prediction model: CMFP

We used baseline clinical metrics, DAT markers, and DTI white matter MD (50) to separately train machine learning models to predict the progression of PD five years later. For single-modal modelling, we used the AdaBoost (Adaptive Boosting) algorithm for classification. AdaBoost is a boosting method that combines multiple weak classifiers to form a strong classifier. Its adaptiveness lies in the fact that the weights of the samples misclassified by the previous weak classifier are increased. After the weights are updated, the samples are used again to train the next weak classifier. In each training round, a new weak classifier is trained on the entire dataset, generating new sample weights and the influence of that weak classifier. This process iterates until a predetermined error rate is achieved or a specified maximum number of iterations is reached. AdaBoost can adaptively adjust the assumed error rate based on feedback from weak classifiers, demonstrating high efficiency. Some researchers have used AdaBoost for early prediction of PD [23].
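As an illustration of the re-weighting described above, the following sketch performs a single boosting round. It assumes the classical discrete AdaBoost update (weak-classifier influence 0.5·ln((1 − err)/err), exponential re-weighting, then normalisation); the paper does not state which AdaBoost variant was used, and the function name and toy labels are hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One discrete-AdaBoost round: return the classifier influence and new sample weights."""
    # Weighted error of the current weak classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0) and division by zero
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples are up-weighted, correctly classified ones down-weighted.
    updated = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Four equally weighted samples; the weak classifier gets the second one wrong.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After this round the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.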

To further improve the performance of univariate baseline data in predicting progress after 5 years, we adopt a cross-modal data combination approach and propose a new decision strategy to improve predictive performance. Fig 1 is the framework of this paper, including data pre-processing, feature selection, modelling, and prediction. The specific steps of CMFP are as follows:

Fig 1. The frame of the method in this paper is composed of four parts: data preprocessing, feature selection, classifier modeling and prediction.


  1. Set the number of epochs to 120. Adam is used as the optimisation algorithm. The initial learning rate is 0.01 and is reduced by a factor of 0.1 every 30 epochs. The loss is computed as the Mean Squared Error (MSE).

  2. The dataset was first randomly partitioned into training sets (60%), validation sets (30%) and test sets (10%). Subsequently, the selection of features using the Lasso method was performed exclusively on the training set, ensuring that no information leakage was induced from the validation or test sets.

  3. For each modality, an AdaBoost model was trained on the training set and then used to predict the validation set. The predicted probability, label, and accuracy were obtained, and each single-modal trained model was saved.

  4. The results of different single-modal predictions (probability, label, and accuracy) were input into the multimodal fusion algorithm (CMF: Cross-Modal Fusion), a set of new prediction results (probability, label and accuracy) were obtained, and the fusion model was saved;

  5. Repeat steps 2-4 until epoch = 120;

  6. Find the model with the highest accuracy saved during training of single-modal and cross-modal fusion, as Ms and Mc;

  7. Input the test set data into model Ms to obtain predicted results (probability, predicted label, and accuracy), then input these results into model Mc for testing to obtain the fused prediction probability and labels from the two modalities.
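Two of the settings in the steps above can be sketched in a few lines: the 60/30/10 random partition and the step-wise learning-rate schedule. This is a sketch under stated assumptions, not the authors' code: the learning-rate decrease is interpreted as multiplication by 0.1 every 30 epochs, subset sizes are obtained by truncation with the remainder assigned to the test set, and the function names and seed are hypothetical.

```python
import random


def split_indices(n_samples, seed=0, train_frac=0.6, val_frac=0.3):
    """Randomly partition sample indices into training, validation, and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_frac * n_samples)
    n_val = int(val_frac * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder (about 10%)
    return train, val, test


def step_lr(epoch, base_lr=0.01, gamma=0.1, step_size=30):
    """Learning rate at a given epoch under step decay (assumed multiplicative)."""
    return base_lr * gamma ** (epoch // step_size)


# 123 patients as in the study; feature selection would then use train_idx only.
train_idx, val_idx, test_idx = split_indices(123)
schedule = [step_lr(epoch) for epoch in range(120)]
```

Restricting Lasso selection to the indices in `train_idx` is what keeps information from the validation and test subsets out of the selected feature set.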

Original input clinical data included 16 clinical variables: 3 items of basic information (age, gender, and years of education), 9 motor and cognitive scores (UPDRS Part I score, UPDRS Part II score, UPDRS Part III score, Total UPDRS score, ESS score, GDS score, UPSIT score, RBDSQ score, MoCA score), and 4 cerebrospinal fluid (CSF) protein concentrations (Aβ42, a-syn, t-tau, p-tau). We used the Lasso method for feature selection. As shown in Table 1, we selected four clinical variables with high correlation (age, UPDRS Part III Score, UPDRS Total Score, UPSIT score), which represent the patient’s age, a score evaluating motor function in PD, the total score of the Unified PD Rating Scale (UPDRS), and the University of Pennsylvania Smell Identification Test score, respectively. Among them, the UPDRS provides a comprehensive and detailed assessment of the severity of the disease. For DTI variables, we used the MD values of brain regions for prediction and selected 7 characteristics (Splenium of Corpus Callosum, Fornix (Column and Body of Fornix), Inferior Cerebellar Peduncle (left), Superior Cerebellar Peduncle (left), Fornix (Crescent)/Stria Terminalis (right), Tapetum Right, Tapetum Left), which represent seven different regions of the brain. For DAT data, four characteristics (left putamen, high putamen, low striatum, high striatum) were finally selected from DAT measures of the caudate nucleus, putamen, and striatum. These nuclei are closely related to the pathogenesis of PD. In addition, in patients with PD, dopamine deficiency in the striatum leads to the corresponding motor symptoms.

Table 1. The features of the strongly correlated combination were selected from three datasets (Clinical, DTI, and DAT) using the Lasso method.

Clinical: UPDRS Part III Score; UPDRS Total Score; Age; UPSIT score
DTI: Splenium of Corpus Callosum; Fornix (Column and Body of Fornix); Tapetum Right; Inferior Cerebellar Peduncle (Left); Superior Cerebellar Peduncle (Left); Tapetum Left; Fornix (Crescent)/Stria Terminalis (Right)
DAT: Putamen Left; High Putamen; Low Striatum; High Striatum

In our proposed prediction method, a critical component lies in the decision strategy (CMF) (specific steps as Algorithm 1). By applying CMF to individual predictions derived from single-modal data, we obtain the final prediction result (the predicted labels and their associated probabilities).

Algorithm 1: Cross-Modal Fusion (CMF).


Before the decision-making process, we obtain predictions from two single-modal datasets, including predicted labels, predicted probabilities, and accuracy values. We denote them as $\rho_a$, $\rho_b$, $\eta_a$, $\eta_b$, $\varphi_a$ and $\varphi_b$, respectively. The single-modal prediction results are then processed through our proposed decision strategy CMF to generate the final fused prediction result $\Phi$, which comprises both the predicted probability and label, i.e., $\Phi = [\rho, \eta]$.

We define three key metrics for the fusion process: category difference: $\Omega = \rho_a - \rho_b$ (the difference between predicted categories from two modalities); accuracy difference sign: $\sigma = \varphi_a - \varphi_b$ (indicating which modality has higher accuracy); probability difference: $\upsilon = \eta_a - \eta_b$ (the difference between predicted probabilities). The final predicted labels and probabilities are determined through different decision branches based on the values of $\Omega$, $\sigma$, and $\upsilon$. The probability adjustment uses the formula $\eta_{new} = \iota / (1 + e^{-\alpha \upsilon})$, where $\alpha$ is set to 0.4 and $\iota$ is set to 1.
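The three fusion metrics and the probability adjustment can be sketched as follows. Two points are assumptions rather than statements from the source: each metric is taken as a plain difference between the two modalities, and the exponent of the adjustment is taken as negative, giving the usual logistic form. The decision branches of Algorithm 1 are not reproduced here because they are not given in the text; function names and input values are hypothetical.

```python
import math


def cmf_metrics(label_a, label_b, prob_a, prob_b, acc_a, acc_b):
    """Category, accuracy, and probability differences between modalities a and b."""
    omega = label_a - label_b   # category difference
    sigma = acc_a - acc_b       # its sign tells which modality is more accurate
    upsilon = prob_a - prob_b   # probability difference
    return omega, sigma, upsilon


def adjust_probability(upsilon, alpha=0.4, iota=1.0):
    """Assumed logistic adjustment: iota / (1 + exp(-alpha * upsilon))."""
    return iota / (1.0 + math.exp(-alpha * upsilon))


# Hypothetical single-modal outputs: modality a predicts class 1, modality b class 0.
omega, sigma, upsilon = cmf_metrics(1, 0, 0.8, 0.3, 0.70, 0.65)
eta_new = adjust_probability(upsilon)
```

With equal single-modal probabilities the adjusted value is exactly 0.5, and it moves toward 1 or 0 as one modality becomes more confident than the other.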

All algorithms in this paper were implemented in Python 3.7.9 with PyTorch and run on the Linux platform (all experiments were carried out on a GNU/Linux x86_64 system with a GeForce RTX 3090 Ti GPU, a 12-core Intel(R) Xeon(R) CPU E5-2678 v3 at 2.50 GHz, and 64 GB RAM).

Ethics

For the PPMI data used in this research, all participants provided written informed consent approved by the institutional review board of each participating institution.

Statistics

To evaluate the effectiveness of the CMFP method, this study compared its performance with that of single- or dual-feature prediction models and with that of models in which AdaBoost was replaced by other machine learning algorithms. When comparing the predictions using a single feature and dual features, the main metrics used were the area under the receiver operating characteristic (ROC) curve (AUC), specificity (SPE) and sensitivity (SEN). Additionally, the model’s performance was comprehensively evaluated using mean absolute error (MAE) and F1 score. To evaluate the overall dataset, we employed the Mann-Whitney U test to assess the differences in ROC curves between various modal fusion models and single-modal models. Specifically, for the prediction model that combines DTI and clinical data, a p-value of approximately $9.183 \times 10^{-4}$ was obtained, which is <0.05 and confirms the statistical significance of the model. When comparing different machine learning algorithm replacement strategies, the main metrics used were AUC under the ROC curve and accuracy (ACC), calculated using an 8-fold cross-validation method. Moreover, the model’s performance was comprehensively evaluated using root mean square error (RMSE), MAE, and F1 score.
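For reference, AUC, sensitivity, and specificity can be computed as in the sketch below. The AUC is obtained from its rank interpretation (the fraction of progression/stable pairs in which the progression case receives the higher score), which is the same quantity that underlies the Mann-Whitney U statistic. The function names, the 0.5 decision threshold, and the toy scores are hypothetical and are not taken from the study.

```python
def auc_score(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true-positive rate) and specificity (true-negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 1 = progression, 0 = stable.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
auc = auc_score(y_true, scores)
sen, spe = sensitivity_specificity(y_true, y_pred)
```

Because AUC depends only on the ranking of the scores, it is unaffected by the choice of threshold, whereas sensitivity and specificity change with it.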

Results

In our approach, the classification step primarily serves to transform predicted probabilities into labels, considering the two possibilities of disease progression: improvement and deterioration. Consequently, classification is carried out for these two categories. We analyze the experimental results from three key perspectives: (1) comparison of model performance with versus without CMF; (2) evaluation of different fusion strategies; and (3) comparison with existing PD progression methods.

Performance with and without CMF integration

Using a single or dual modality of data to predict PD progression leads to different results. Table 2 presents the average AUC values for the prediction of PD progression using each of the three modalities separately. It can be seen that each AUC value is relatively low. Here, Γi denotes a feature combination, where i is the number of characteristics combined. The specific combinations of characteristics are shown in Table 3. Γ21, Γ23, and Γ50 denote the use of all features of the DAT, clinical, and DTI modalities, respectively. It is noteworthy that the AUC values of the single-modality baselines are close to or even slightly lower than the level of random guessing. This highlights the limitations of single data sources in characterizing complex neurological diseases, as the information they provide may be insufficient to construct effective predictive models.

Table 2. The average AUC values for the separate prediction of PD progression using three (Clinical, DTI, and DAT) modalities.

Γ represents feature selection, Γi represents a feature combination, and i represents the number of features in the combination. Different data have different feature combinations, as shown in Table 3.

Γ2 Γ3 Γ4 Γ5 Γ6 Γ7 Γ21 Γ23 Γ50
Clinical 0.4077 0.4371 0.4515 0.4478 - - - 0.4629 -
DTI 0.4208 0.4476 0.4489 0.4310 0.4770 0.4191 - - 0.4381
DAT 0.5017 0.4767 0.4631 - - - 0.4506 - -

Table 3. δi, γi and ϑi represent feature combinations for clinical, DTI, and DAT data, respectively, where i represents the number of feature combinations.

UP3: UPDRS Part III score, PTs: UPDRS Total Score, Uts: UPSIT score, STs: STAI scores. SCC: Splenium of Corpus Callosum, Fcb: Fornix (Column and Body of Fornix), ICl: Inferior Cerebellar Peduncle (Left), SCl: Superior Cerebellar Peduncle (Left), FcSr: Fornix (Crescent)/Stria Terminalis (Right), TaR: Tapetum Right, TaL: Tapetum Left. PuL: Putamen Left, HPu: High Putamen, LSt: Low Striatum, HSt: High Striatum.

δ2 Age UP3
δ3 Age UP3 PTs
δ4 Age UP3 PTs Uts
δ5 Age UP3 PTs Uts STs
γ2 SCC Fcb
γ3 SCC Fcb ICl
γ4 SCC Fcb ICl SCl
γ5 SCC Fcb ICl SCl FcSr
γ6 SCC Fcb ICl SCl FcSr TaR
γ7 SCC Fcb ICl SCl FcSr TaR TaL
ϑ2 PuL HPu
ϑ3 PuL HPu LSt
ϑ4 PuL HPu LSt HSt

Using our proposed cross-modal fusion prediction method (CMFP), we tested the combination of clinical data with DTI and with DAT, and the AUC results are shown in Table 4. Block A (yellow background in the original table) gives the AUC values predicted by the combination of clinical and DTI data, while block B (purple background) gives the AUC values predicted by the combination of clinical and DAT data. The AUC values vary with the number of features selected. When selecting 4 clinical characteristics and 7 DTI characteristics, the AUC was the highest, reaching 0.7791. Therefore, in the subsequent analysis, we use this combination as the experimental object. In contrast, the combined prediction of clinical and DAT data was relatively poor: its highest AUC, 0.6, was obtained when selecting 2 clinical characteristics and 2 DAT characteristics. Using the proposed method, the variance over eight predictions with the combination of four clinical and seven DTI characteristics is 0.1085, whereas the variance over eight predictions with the combination of two clinical and two DAT characteristics is 0.1929. Overall, the combined prediction of clinical with DTI data was better than that of clinical with DAT data.

Table 4. The comparison of AUC values for the separate predictions combining clinical data with DTI and DAT.

δi, γi, and ϑi represent feature combinations for clinical, DTI, and DAT data, respectively. δi, γi, and ϑi represent the same as defined in Table 3. A: denotes the combination prediction of clinical and DTI, B: denotes the combination prediction of clinical and DAT.

A δ23 δ2 δ3 δ4 δ5
γ50 0.6409 0.4647 0.5539 0.4873 0.4859
γ2 0.4955 0.5113 0.5506 0.5102 0.6459
γ3 0.5315 0.5161 0.5053 0.5193 0.5858
γ4 0.5458 0.4963 0.5435 0.4566 0.6038
γ5 0.5288 0.4936 0.4623 0.6019 0.4905
γ6 0.4872 0.6439 0.7096 0.5681 0.6135
γ7 0.6677 0.5228 0.4717 0.7791 0.5172
B δ23 δ2 δ3 δ4 δ5
ϑ21 0.5759 0.4670 0.4144 0.4606 0.4960
ϑ2 0.5592 0.6000 0.5683 0.5694 0.5842
ϑ3 0.5591 0.5990 0.4982 0.4657 0.5146
ϑ4 0.5729 0.5207 0.5518 0.5130 0.5882

In Table 4, it can be seen that when combining the four characteristics of clinical data with the seven characteristics of DTI data for prediction, the AUC value obtained is the best among all the combinations of characteristics. Similarly, when the two characteristics of the clinical data are combined with the two characteristics of the DAT data for prediction, a better AUC value is also obtained. To further analyse the performance of the single-modality data prediction model and the dual-modality data prediction model, we calculated the MAE and the F1 score. Fig 2(a) and 2(b) respectively display the MAE and F1 Score evaluation results for the five cases of predictions. MAE is used to measure the average deviation between the predicted results and the actual observed values. A smaller MAE value indicates that the model’s predicted results are more consistent with the true values. F1 Score is a harmonic mean based on precision and recall, where a higher value indicates a better classification performance of the model.
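The two measures described above can be written out directly. The sketch below uses hypothetical binary labels (1 = progression, 0 = stable) and hypothetical function names; it is meant only to make the definitions concrete.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between predicted and observed values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Toy labels: two of the five predictions are wrong.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
mae = mean_absolute_error(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Here one false positive and one false negative give precision and recall of 2/3 each, so the F1 score is also 2/3, while the MAE is 0.4.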

Fig 2. The average and variance of MAE and F1 Score values for five predictions of PD progression, utilizing either single-modal or dual-modal data combinations.


In Fig 2(a), it can be seen that among the three single-modality data predictions, the use of DTI data achieves the best MAE performance. Among the two dual-modality fusion predictions, the combination of clinical data and DTI data has a smaller MAE. Furthermore, in Fig 2(b), it can be observed that of the five prediction results, the combination of clinical data and DTI data has the highest F1 score. At the same time, the F1 score for single-modality prediction is much lower than that for dual-modality prediction. We also calculated the variances of the different predictions in terms of the MAE and F1 Score metrics, as shown in Fig 2(c).

Evaluation of different fusion strategies

In our proposed model, we incorporated AdaBoost. In Fig 3, we compare the performance of our method with that of other prominent ML techniques (Logistic Regression (LR), Gaussian Naive Bayes (GaussianNB), Decision Tree (DT), Support Vector Machine (SVM), K-Nearest Neighbours (KNN), Random Forest (RF), Extra Trees (ExtraTree)), each substituted for AdaBoost. In Fig 3, CMFP refers to the results obtained by the method proposed in this work. To comprehensively assess the performance of the model, we used four key performance metrics, including ACC, RMSE, and MAE. ACC represents the ratio of correctly classified samples to the total number of samples. The closer the value is to 1, the higher the accuracy of the model predictions. RMSE measures the gap between predicted and actual values, with smaller values indicating a better model fit. MAE measures the average absolute difference between predicted and actual values, and smaller values signify better model performance. From Figs 3 to 5, it is evident that our method shows advantages across all four metrics. In terms of ACC, our method outperforms the second-best algorithm, RF, by 6.73%. Furthermore, compared to RF, the RMSE and MAE of our algorithm are lower by 6.81% and 6.73%, respectively. These findings further confirm the strong predictive performance of our method.
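The definitions of ACC and RMSE above can be made concrete with a short sketch. It assumes hard 0/1 labels are compared, which the text does not state explicitly; the function names and toy labels are hypothetical.

```python
import math


def accuracy(y_true, y_pred):
    """Ratio of correctly classified samples to the total number of samples."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared gap between predicted and actual values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels: three of the five predictions are correct.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
acc = accuracy(y_true, y_pred)
rmse = root_mean_square_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
```

When hard 0/1 labels are compared, MAE equals 1 − ACC and RMSE equals the square root of MAE. Under that assumption, equal-sized ACC and MAE gaps between two methods, such as the 6.73% reported against RF, are what one would expect.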

Fig 3. By replacing Adaboost with other machine learning methods and calculating four evaluation metrics, we compared the performance of the model.


Fig 5. By replacing Adaboost with other machine learning methods and calculating four evaluation metrics, we compared the performance of the model.


Comparison with existing methods

In recent years, numerous novel methods for predicting PD progression have emerged (as summarised in Table 5), with particularly notable advances in multimodal data fusion. Table 5 categorises these approaches into four types: clinical-genetic fusion, clinical-imaging fusion, clinical-genetic-imaging fusion, and clinical-biomarker-imaging fusion. Most studies used the publicly available PPMI dataset, and traditional machine learning methods remained dominant in methodology. Among clinical-genetic fusion studies, the work of Chen et al. [24] achieved remarkable performance, which was supported by a large sample size to enhance statistical power and generalisability. It should be noted, however, that sample size alone does not determine predictive accuracy, as demonstrated by Liu et al. [25]. In clinical-imaging fusion, Liu et al. [25] reported better results, but their study used a private dataset, which may limit generalisability. In contrast, the method proposed in this study demonstrates the best performance in predicting 5-year progression, primarily through the fusion of clinical and neuroimaging data. The combination of clinical data and DTI yielded the most significant improvements, which will be discussed further in the following section.

Table 5. Comparison of methods for predicting PD progress using multimodal data fusion.

C-G: clinical and genetic, C-N: clinical and neuroimaging, C-G-N: clinical and genetic and neuroimaging, C-B-N: Clinical and biomarker and neuroimaging.

Fusion type | Ref. | Dataset: Subjects | Method | Time point | Performance
C-G | Redenšek et al. [26] | Recruited participants: 220 PD | Linear regression | 5 year | AUC: 0.71
C-G | Chen et al. [24] | PPMI: 409 PD | Logistic regression | 5 year | AUC: 0.8
C-G | Krishnagopal et al. [27] | PPMI: 194 PD | Network-based Trajectory Profile Clustering algorithm | 4 year | ACC: 0.72
C-N | Jackson et al. [28] | PPMI: 139 PD | Logistic, Ridge regression | 1 year | AUC: 0.62
C-N | Tang et al. [29] | PPMI: 69 PD | ANNs | 4 year | ACC: 0.75
C-N | Salmanpour et al. [30] | PPMI: 885 PD | HMLS | 4 year | ACC: 0.792
C-N | Hu et al. [31] | UK Biobank vision cohort: 66500 participants | Logistic regression | 5 year | AUC: 0.717
C-N | Liu et al. [25] | Ruijin Hospital Affiliated to Shanghai Jiao Tong University: 33 PD | Logistic regression | 6 months | ACC: 0.8, AUC: 0.85
C-N | Ours | PPMI: 123 PD | CMFP | 5 year | AUC: 0.7791, ACC: 0.8077
C-G-N | Sadaei et al. [32] | PPMI, PDBP: 529 PD, 350 PD | XGBoost | 1, 2 and 3 year | AUC: 0.77, 0.76
C-B-N | Kim et al. [33] | PPMI: 393 PD | Linear regression | 4 year | AUC: 0.755
C-B-N | Li et al. [34] | PPMI: 73 HC, 158 PD | LSVM, KNN, Bayes, LDA, Elastic Net | 5 year | AUC: 0.77, ACC: 0.78
C-B-N | Chen et al. [35] | PPMI: 338 PD | Cox regression | 5 year | AUC: 0.77

Discussion

Clinical significance analysis of data

The clinical significance of HYS changes is highly dependent on the specific stage in which they take place. Among them, the transition from Stage 2 to Stage 3 of HYS is of unique and paramount importance, acting as a pivotal clinical milestone in the progression of the disease [36]. This shift typically foreshadows more severe disability, a significantly elevated risk of falls, and alterations in treatment response for patients. In contrast, while changes from stage 1 to stage 2 of HYS indicate progression of symptoms, their impact on patient functional status and prognosis is generally less pronounced compared to the transition from Stage 2 to Stage 3. In addition, this change often occurs during the relatively early "honeymoon period" of the disease. Consequently, if a research cohort comprises a higher proportion of patients experiencing early stage (stage 1-2) changes, the "clinical weight" of their disease progression may be lower than that of a cohort with a greater number of patients undergoing Stage 2-3 transitions. Given these factors, this study grouped patients solely on the basis of whether there was progression of HYS (yes/no), without further distinguishing the specific stage of HYS at which progression occurred.

The HYS scale has certain limitations, including relatively low resolution. In particular, it is difficult for HYS Stage 2 to accurately capture subtle changes in motor function. In contrast, the MDS-UPDRS Part III score, as a continuous variable scale, can conduct quantitative assessments of specific motor signs (such as tremor, rigidity, and bradykinesia), thus providing a more sensitive and objective means of monitoring the progression of the disease [37]. However, this study focuses on the critical transition in disease staging (such as the important turning point from Stage 2 to Stage 3), and the HYS scale, as an authoritative standard for clinical staging, has a stronger correlation with the long-term prognosis while also allowing standardised data collection in multicentre studies [36]. Selecting the HYS scale is therefore reasonable for this study.

DTI is a new MRI technique developed in recent years. It uses multiple diffusion-sensitive gradients of different sizes to show the strength of the diffusion capacity of water molecules in vivo and the direction of the diffusion movement. It focuses on observing the biochemical composition, microstructure and arrangement of tissues. This technique is the only non-invasive imaging method that shows the diffusion characteristics of water molecules in living tissue. DTI can elucidate the pathological changes in the microstructure of cerebellar white matter in patients with PD. This imaging technique not only sensitively captures the patterns of damage to cerebellar white matter during the course of the disease, but also reveals deep associations with key clinical features of PD [6]. Additionally, DAT imaging visually displays the spatial distribution and functional status of dopamine transporters in the brain, directly reflecting the integrity of the nigrostriatal dopaminergic system. Given that dopaminergic dysfunction is the most central pathological feature of PD, the quantitative measures provided by DAT exhibit a strong correlation with the severity and progression stages of the disease. Structural characteristics derived from 3D T1-weighted imaging, such as brain volume and cortical thickness, also hold significant value in Parkinson’s disease research. However, in the early stages of PD, structural changes typically emerge later than the microstructural alterations of white matter detected by DTI and the functional abnormalities revealed by DAT. Therefore, in this study, we did not include 3D T1-weighted imaging in our analysis.

Some researchers have explored white matter changes using DTI and found that, compared to the control group, white matter damage in the frontal lobe of patients with PD-MCI was more pronounced and was related to impairment of general cognitive function and of multiple cognitive domains [38,39]. Because the neurological mechanisms of PD are complex and diverse, multimodal magnetic resonance characteristics have been integrated to evaluate the disease at multiple levels and from different perspectives, providing a more reasonable way to further improve the precision of PD diagnosis [40–42]. Pyatigorskaya et al. [40] combined substantia nigra volume, NMS-MRI signal intensity and fractional anisotropy from DTI to achieve a diagnostic precision of 93% for PD. Lei et al. [43,44] created a PD classification prediction model based on multimodal medical imaging (mainly MRI and DTI) to detect PD automatically and predict clinical scores. Chougar et al. [45] used the volume and DTI measures of 13 brain regions as inputs to a supervised ML algorithm that accurately classified PD, PSP, and MSA-P. Kim et al. [46] compared newly diagnosed PD patients with healthy individuals of the same age. By computing and contrasting quantitative anisotropy values across subcortical and cortical regions in both cohorts, they aimed to identify the brain areas most prominently affected during the early stages of PD. Shin et al. [47] used MRI-derived cortical thickness to predict the transition from mild cognitive impairment to dementia in PD. Their work demonstrated that cortical thickness helps predict this transition at the individual level, with better performance when combined with clinical data.

Taken together, the results in Tables 2 and 4 identify the key clinical metrics that contribute to the prediction of PD and allow potential characteristics associated with the progression of PD to be inferred. The crucial clinical characteristics include age, UPDRS Part III score, UPDRS total score, and UPSIT score, as shown in Table 1.

Recently, the advent of reliable network characterisation techniques has made it possible to understand neurological disorders at the level of whole-brain connectivity. However, few studies so far have used white matter data as the main object to predict the progression of PD. Wee et al. [48] proposed an effective network-based multivariate classification algorithm that uses white matter fiber data to accurately distinguish patients with MCI from normal controls. Their results suggest that the proposed classification framework could provide an alternative and complementary approach to the clinical diagnosis of brain structural changes associated with cognitive impairment, but this research focused on Alzheimer’s disease. Huang et al. [7] adopted the network-based elastomere consensus ranking (ENFCR) method to explore whether baseline features of white matter structural connectivity obtained through DTI can predict the future development of MCI in newly diagnosed PD patients. Their results indicated that white matter structural connectivity plays a significant role in predicting the progression from PD to MCI. However, no significant biomarkers were identified. Zhang et al. [49] used the TBSS method to analyse the FA of brain white matter (WM) in patients with PD versus healthy controls, finding that integrating WM lesion regions with clinical information significantly improves prediction accuracy for disease progression. However, this method requires manual intervention, and the location of WM lesion areas is uncertain.

Advantages of cross modal data fusion prediction

In the model of this paper, aside from clinical features, seven features (Splenium of Corpus Callosum, Fornix (Column and Body of Fornix), Inferior Cerebellar Peduncle (Left), Superior Cerebellar Peduncle (Left), Fornix (Crescent)/Stria Terminalis (Right), Tapetum Right, and Tapetum Left) were selected from the DTI variables for brain white matter MD (50) data, as shown in Table 1. White matter at the individual level contributes to the prediction of PD progression and, when combined with clinical characteristics, improves the predictive performance of the model. The predictive model based on DTI’s global white matter features has an AUC of 0.44035 (variance 0.0185), the model based solely on clinical features has an AUC of 0.4414 (variance 0.0187), and the predictive model that combines DTI’s white matter features with clinical features has an AUC of 0.7791 (variance 0.1085).

Compared to single-modal approaches, multimodal fusion techniques can achieve information complementarity, capturing a more complete representation of brain changes in patients with PD. This allows for a holistic evaluation and analysis of the disease from various points of view, providing crucial information for a comprehensive diagnostic assessment.
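As a rough illustration of how decision-level fusion can exploit complementary modalities, the sketch below averages per-modality class probabilities with fixed weights. It is not the CMF strategy proposed in this paper; the function name `fuse_probabilities`, the equal weights and the example probabilities are all hypothetical and serve only to show the general idea.

```python
from typing import Dict, List


def fuse_probabilities(
    modality_probs: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of per-modality probabilities of progression.

    Assumption: each modality-specific classifier (AdaBoost in this paper)
    has already produced one probability per subject, in the same order.
    """
    total = sum(weights[name] for name in modality_probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_subjects = len(next(iter(modality_probs.values())))
    fused = []
    for i in range(n_subjects):
        score = sum(weights[name] * probs[i] for name, probs in modality_probs.items())
        fused.append(score / total)
    return fused


# Hypothetical outputs for three subjects (not data from this study).
probs = {"clinical": [0.75, 0.25, 0.5], "dti": [0.25, 0.25, 1.0]}
fused = fuse_probabilities(probs, weights={"clinical": 1.0, "dti": 1.0})
labels = [int(p >= 0.5) for p in fused]  # 1 = predicted progression
```

With unequal weights, a more reliable modality can be given more influence on the fused decision; how such weights should be chosen is a design decision that this sketch leaves open.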

Fig 4 displays five ROC curves, one for each of five scenarios: clinical data alone, DTI data alone, DAT data alone, clinical combined with DTI data, and clinical combined with DAT data. Each scenario is drawn in a distinct colour. In Fig 4(a), the AUC values can be determined from the position of the ROC curves. When clinical, DTI, or DAT data are used individually, the AUC values are relatively low. However, when either type of image data (DTI or DAT) is fused with clinical data to predict the progression of PD, performance improves over the single-modal methods. Of the two combinations, clinical plus DTI data yields the better prediction, with a marked increase in the AUC value. In Fig 4(b), the shaded area represents the confidence interval. The size of the confidence interval also appears to be positively correlated with the AUC value. For the prediction combining clinical and DTI data, the population parameter is estimated to fall within this range at the 95% confidence level, indicating that DTI data are more beneficial than DAT data in predicting the progression of PD.
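To make the link between an ROC curve and its AUC concrete, the following minimal sketch sweeps the decision threshold over a set of classifier scores, collects the (FPR, TPR) points and integrates them with the trapezoidal rule. The function names, labels and scores are hypothetical and are not taken from the study data.

```python
from typing import List, Tuple


def roc_points(y_true: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Sort subjects by decreasing score; tied scores are processed together.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            tp += y_true[order[j]]
            fp += 1 - y_true[order[j]]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc_trapezoid(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels (1 = progressed) and classifier scores.
y = [1, 1, 0, 1, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_trapezoid(roc_points(y, s))  # 8 of 9 positive-negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to the diagonal of the ROC plot, that is, to chance-level ranking of progressors and non-progressors.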

Fig 4. The comparison of different ROC in five cases. The shaded area in (b) represents the confidence interval.


Furthermore, as indicated in Tables 2 and 4 and Fig 4, combining clinical data with DTI data for prediction performs better than combining clinical data with DAT data. We also performed a Mann-Whitney U test, which showed that the combined prediction of clinical and DTI data differs significantly from the prediction using clinical data alone, with a p-value of 9.183e-4.
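For readers unfamiliar with the test, the sketch below implements a two-sided Mann-Whitney U test with the normal approximation. It is only an illustration: the paper does not state which implementation or which samples were used, so the function name and the two hypothetical samples of per-run AUC values are assumptions.

```python
import math
from typing import List, Tuple


def mann_whitney_u(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic for x, approximate p-value). No tie or continuity
    correction is applied to the variance, so treat the p-value as a rough guide.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    # Assign midranks so that tied values share the same rank.
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical per-run AUC values for two models (not results from this paper).
fused_auc = [0.80, 0.78, 0.75, 0.82]
clinical_auc = [0.45, 0.50, 0.40, 0.48]
u_stat, p = mann_whitney_u(fused_auc, clinical_auc)
```

Because the test is rank-based, it makes no normality assumption about the metric values, which suits a small number of repeated runs; for very small samples an exact test would be preferable to the normal approximation used here.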

To further compare the combined prediction results of clinical data with DTI and with DAT, we calculated ACC, sensitivity, and specificity, as shown in Table 6. The data in the table are the averages and variances over eight test runs, and the combination of clinical and DTI data yields the higher average value for every metric. As the variance of each metric in Table 6 indicates, the current model performance still fluctuates and requires further improvement. This variance reflects the uncertainty of model predictions under conditions of limited and heterogeneous data. Nevertheless, the core conclusions of this study remain robust, supported by statistical significance and a consistent performance ranking. Future work should involve larger-scale, multi-center data to further validate the stability and reproducibility of the model.
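The metrics above can be computed from a confusion matrix and then summarised over repeated runs, as in the following sketch. The helper names are hypothetical, the two example runs are invented for illustration, and the use of the population variance is an assumption, since the paper does not state which variance estimator was used.

```python
from statistics import mean, pvariance
from typing import Dict, List


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """ACC, sensitivity and specificity for binary labels (1 = progressed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "ACC": _ratio(tp + tn, len(pairs)),
        "SEN": _ratio(tp, tp + fn),
        "SPE": _ratio(tn, tn + fp),
    }


def summarise_runs(runs: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Average and population variance of each metric over repeated runs."""
    return {
        name: {
            "Avg": mean([run[name] for run in runs]),
            "Var": pvariance([run[name] for run in runs]),
        }
        for name in runs[0]
    }


# Two hypothetical runs (the study reports eight); not data from this paper.
run_1 = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])
run_2 = classification_metrics([1, 1, 0, 0], [1, 1, 1, 0])
summary = summarise_runs([run_1, run_2])
```

In this toy example both runs have the same accuracy but trade sensitivity against specificity, which is why reporting the variance of each metric alongside its average is informative.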

Table 6. Average (Avg) and variance (Var) of the four metrics for predictions combining clinical data with DTI and with DAT.

Clin-Dti: combined prediction using clinical and DTI data; Clin-Dat: combined prediction using clinical and DAT data.

Statistic Model AUC ACC Sensitivity Specificity
Avg Clin-Dti 0.7791 0.8077 0.9464 0.7470
Avg Clin-Dat 0.6000 0.7385 0.8679 0.6298
Var Clin-Dti 0.1085 0.1490 0.1417 0.1890
Var Clin-Dat 0.1929 0.0923 0.1041 0.1516

In summary, combined prediction of clinical and DTI data is more advantageous in predicting the progression of PD.

To demonstrate the advantage of our approach, we further analysed the AUC values obtained when the machine learning method in the model is replaced by alternatives. The AUC quantifies the model’s discriminative performance; it ranges from 0 to 1, with higher values indicating better performance.

Fig 5 compares the predictive performance of different fusion algorithms: the left side shows the AUC values and the right side shows the corresponding ROC curves. CMFP refers to the results obtained by the method proposed in this work. Among the machine learning algorithms listed, our method achieves the highest AUC value and the best ROC curve. SVM ranks second, and our method still exceeds it by 2.09% in AUC.

DTI provides a unique perspective on the microscopic structural changes of living tissues. In particular, the analysis of DTI along the perivascular space (DTI-ALPS) has emerged in recent years as a promising method to assess human glymphatic system function, attracting widespread attention from academic and clinical fields [50]. The DTI-ALPS metric is derived by calculating the diffusion ratio along perivascular spaces (PVS) in the periventricular white matter, and it is regarded as an indirect measure of glymphatic system status and function. It not only offers a new perspective for understanding how the glymphatic system operates, but also provides a powerful tool for assessing the progression of neurodegenerative diseases and other neurological disorders.
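The ratio can be written out explicitly. The sketch below follows the commonly used definition of the ALPS index from Taoka et al. [51], in which x-axis diffusivity in the projection and association fibre areas is divided by the diffusivity perpendicular to both the fibres and the perivascular spaces. Whether this study used exactly this formula and these regions of interest is an assumption; the function name, parameter names and example diffusivities are hypothetical.

```python
def alps_index(
    dxx_proj: float,
    dxx_assoc: float,
    dyy_proj: float,
    dzz_assoc: float,
) -> float:
    """ALPS index = mean(Dxx_proj, Dxx_assoc) / mean(Dyy_proj, Dzz_assoc).

    dxx_*: diffusivity along the x-axis (the assumed direction of the
    perivascular spaces) in the projection and association fibre ROIs.
    dyy_proj, dzz_assoc: diffusivity perpendicular to both the main fibre
    direction and the x-axis in the respective ROI.
    """
    numerator = (dxx_proj + dxx_assoc) / 2.0
    denominator = (dyy_proj + dzz_assoc) / 2.0
    if denominator <= 0:
        raise ValueError("perpendicular diffusivities must be positive")
    return numerator / denominator


# Hypothetical diffusivities in units of 1e-3 mm^2/s (not measured values).
example_alps = alps_index(dxx_proj=0.6, dxx_assoc=0.6, dyy_proj=0.4, dzz_assoc=0.4)
```

A value close to 1 means that diffusion along the perivascular direction is no stronger than in the perpendicular reference directions, while larger values indicate relatively greater diffusivity along the perivascular spaces.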

The concept of DTI-ALPS was initially proposed by Taoka [51] in research on Alzheimer’s disease. That work showed a significant positive correlation between the ALPS index and MMSE scores, implying that as the severity of AD increases, the rate of water diffusion along perivascular spaces decreases. This discovery laid a solid foundation for subsequent research in fields such as PD. In PD, DTI-ALPS has also demonstrated significant value, with PD patients showing a markedly lower ALPS metric than healthy controls [50,52–54]. More importantly, differences in ALPS also reflect the relationship between glymphatic clearance, cognitive function, and disease severity in patients with PD [54]. A lower baseline DTI-ALPS has been found to be closely associated with subsequent declines in cognitive ability and worsening of disease severity [54]. Therefore, DTI-ALPS has considerable reference value for improving the accuracy of PD progression predictions.

To further validate the effectiveness of DTI-ALPS in predicting the progression of PD, we performed an additional experiment. We calculated the ALPS values for the left cerebral hemisphere of each subject and combined them with clinical data, using multiple metrics to evaluate the prediction and the model. As shown in Fig 6(a), across the six metrics, prediction using clinical data combined with ALPS was better than prediction using clinical data combined with DTI, with an improvement in ACC of 3.85%. Fig 6(b) presents a further analysis of the model that combines clinical data with ALPS values. We compared the AdaBoost algorithm with other machine learning algorithms and found that our method outperformed the others on multiple evaluation metrics. In particular, the F1 score, which balances the precision and recall of a classification model, was 1.52% higher for our method than for the second-best method. Meanwhile, the RMSE of our method was 5.19% lower than that of the second-best algorithm.
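For reference, the F1 score and RMSE mentioned above can be computed as in the following sketch. The labels are hypothetical, and computing RMSE on hard class labels rather than on predicted probabilities is an assumption, since the paper does not specify which was used.

```python
import math
from typing import List, Sequence


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def rmse(y_true: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean square error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_hat)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical labels (1 = progressed); not data from this study.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
f1 = f1_score(y_true, y_pred)  # TP = 2, FP = 1, FN = 1
err = rmse(y_true, y_pred)     # two of five labels are wrong
```

A higher F1 and a lower RMSE both indicate better agreement with the true labels, which is why the two metrics move in opposite directions in the comparison above.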

Fig 6. (a) Clinical+DTI vs. DTI+ALPS comparison; (b) Replace Adaboost with other ML methods for comparison.


In summary, our research shows that combining DTI white matter data with clinical characteristics helps predict PD progression more accurately. Furthermore, constructing whole-brain white matter features with machine learning provides a new approach to assessing and monitoring disease severity in PD patients, which may help clinicians formulate more effective treatment plans early. With continued technological advances and deeper research, methods based on DTI-ALPS are expected to find even broader application in predicting the progression of PD.

Conclusion

PD progresses slowly over a long period, and its diagnosis depends mainly on clinical signs and symptoms. The disease is very difficult to treat in its middle and late stages, so early prediction is crucial for implementing appropriate interventions. However, reliable biomarkers for the progression of PD are still lacking, which makes a model for early prediction of disease progression highly valuable. DTI quantitatively measures the extent and direction of the diffusion of water molecules in the brain and assesses the structural integrity and continuity of the brain’s white matter fibres, so it can indicate potential changes in the early stage of the disease. The method in this paper is built on longitudinal data: ML is used to establish a PD progression model by integrating baseline DTI and clinical data from early PD, which is expected to provide an imaging basis for early clinical decision-making in PD.

This paper predicts PD progression mainly through cross-modal fusion of clinical data with DTI/DAT data. The main limitation is that the method has only been validated on public data sets and has not been validated on external data from other sources. Although the clinical significance of HYS changes varies between stages, this study did not distinguish the exact stage of progression. In addition, although choosing the HYS scale as the marker of disease progression offers universal applicability in clinical staging, its limited resolution may weaken the analysis of associations with continuous changes in motor function. Future studies should incorporate fine-grained scales such as MDS-UPDRS to validate the generalisability of this classifier. Our further research will combine additional modalities, such as grey matter structure data and genetic data, to further study PD and other neurological diseases.

Numerous abbreviations are used in this paper; their full names are listed in Table 7.

Table 7. Abbreviations and their full names.

Abbreviation: Specific name; Abbreviation: Specific name
PD: Parkinson’s disease; CMFP: Cross-Modal Fusion Prediction Model
CMF: Cross-Modal Fusion; DTI: Diffusion tensor imaging
DAT: Dopamine transporter; ALPS: Along the perivascular space
FA: Fractional Anisotropy; DTI-ALPS: the DTI metric along the perivascular space
MD: Mean Diffusivity; UPDRS: Unified Parkinson’s Disease Rating Scale
ML: Machine learning; MDS-UPDRS: Movement Disorder Society-sponsored revision of the UPDRS
HYS: Hoehn and Yahr Scale; PPMI: Parkinson’s Progression Markers Initiative
MRI: Magnetic Resonance Imaging; FACT: Fiber Assignment by Continuous Tracking
MSE: Mean Squared Error; ESS: Epworth Sleepiness Scale
GDS: Geriatric Depression Scale; UPSIT: University of Pennsylvania Smell Identification Test
MoCA: Montreal Cognitive Assessment; RBDSQ: Rapid Eye Movement Sleep Behavior Disorder Screening Questionnaire
CSF: Cerebrospinal Fluid; ROC: Receiver Operating Characteristic Curve
AUC: Area Under the Curve; SPE: Specificity
SEN: Sensitivity; ACC: Accuracy
MAE: Mean Absolute Error; RMSE: Root Mean Square Error
LR: Logistic Regression; GaussianNB: Gaussian Naive Bayes
DT: Decision Tree; SVM: Support Vector Machine
KNN: K-Nearest Neighbours; RF: Random Forest
ExtraTree: Extra Trees; NMS-MRI: Neuromelanin-Sensitive Magnetic Resonance Imaging
PSP: Progressive Supranuclear Palsy; MSA-P: Multiple System Atrophy with Predominant Parkinsonism
ENFCR: Elastomere Consensus Ranking; MCI: Mild Cognitive Impairment
TBSS: Tract-Based Spatial Statistics; WM: White Matter
PVS: Perivascular Spaces

Data Availability

The format has been improved according to the requirements of the journal and the code has also been uploaded to Dryad Digital Repository (DOI: 10.5061/dryad.63xsj3vdv), which can be found at http://datadryad.org/share/zvyCwl-SD-ApTC7IwAXSLJJxA1W53rjXR_c1xPykaxM.

Funding Statement

This work was supported in part by the National Natural Science Foundation of China (No. 62072126, No. 62506086), in part by the Fundamental Research Projects Jointly Funded by Guangzhou Council and Municipal Universities No. SL2023A03J00639, in part by the Key Laboratory of Philosophy and Social Sciences in Guangdong Province of Maritime Silk Road of Guangzhou University (GD22TWCXGC15), in part by the Natural Science Foundation of Chongqing (No. CSTB2024NSCQ-MSX1087), and the Guangxi Science and Technology Program (No. AD23023001).

References

  • 1.Kumar T, Sharma P, Prakash N. Comparison of machine learning models for Parkinson’s Disease prediction. In: 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON). 2020. p. 195–9. doi: 10.1109/uemcon51285.2020.9298033
  • 2.Agosta F, Weiler M, Filippi M. Propagation of pathology through brain networks in neurodegenerative diseases: from molecules to clinical phenotypes. CNS Neurosci Ther. 2015;21(10):754–67. doi: 10.1111/cns.12410
  • 3.Qu JZ. Neuroprotection and classification of neurologic dysfunction in aortic arch surgery: a narrative review. Heart and Mind. 2023;8(2):74–80. doi: 10.4103/hm.hm-d-23-00010
  • 4.Post B, Speelman JD, de Haan RJ, CARPA-Study Group. Clinical heterogeneity in newly diagnosed Parkinson’s disease. J Neurol. 2008;255(5):716–22. doi: 10.1007/s00415-008-0782-1
  • 5.Dadu A, Satone V, Kaur R, Hashemi SH, Leonard H, Iwaki H, et al. Identification and prediction of Parkinson’s disease subtypes and progression using machine learning in two cohorts. NPJ Parkinsons Dis. 2022;8(1):172. doi: 10.1038/s41531-022-00439-z
  • 6.Haghshomar M, Shobeiri P, Seyedi SA, Abbasi-Feijani F, Poopak A, Sotoudeh H, et al. Cerebellar microstructural abnormalities in Parkinson’s disease: a systematic review of diffusion tensor imaging studies. Cerebellum. 2022;21(4):545–71. doi: 10.1007/s12311-021-01355-3
  • 7.Huang X, He Q, Ruan X, Li Y, Kuang Z, Wang M, et al. Structural connectivity from DTI to predict mild cognitive impairment in de novo Parkinson’s disease. Neuroimage Clin. 2024;41:103548. doi: 10.1016/j.nicl.2023.103548
  • 8.Yang K, Wu Z, Long J, Li W, Wang X, Hu N, et al. White matter changes in Parkinson’s disease. NPJ Parkinsons Dis. 2023;9(1):150. doi: 10.1038/s41531-023-00592-z
  • 9.Schwarz ST, et al. Diffusion tensor imaging of nigral degeneration in Parkinson’s disease: a region-of-interest and voxel-based study at 3 T and systematic review with meta-analysis. NeuroImage: Clinical. 2013;3:481–8.
  • 10.Amandola M, Sinha A, Amandola MJ, Leung H-C. Longitudinal corpus callosum microstructural decline in early-stage Parkinson’s disease in association with akinetic-rigid symptom severity. NPJ Parkinsons Dis. 2022;8(1):108. doi: 10.1038/s41531-022-00372-1
  • 11.Huang H, Perone F, Leung KSK, Ullah I, Lee Q, Chew N, et al. The utility of artificial intelligence and machine learning in the diagnosis of Takotsubo cardiomyopathy: a systematic review. Heart and Mind. 2024;8(3):165–76. doi: 10.4103/hm.hm-d-23-00061
  • 12.Kuang B, Tekin Y, Mouazen AM. Comparison between artificial neural network and partial least squares for on-line visible and near infrared spectroscopy measurement of soil organic carbon, pH and clay content. Soil and Tillage Research. 2015;146:243–52. doi: 10.1016/j.still.2014.11.002
  • 13.Nilashi M, Abumalloh RA, Yusuf SYM, Thi HH, Alsulami M, Abosaq H, et al. Early diagnosis of Parkinson’s disease: a combined method using deep learning and neuro-fuzzy techniques. Comput Biol Chem. 2023;102:107788. doi: 10.1016/j.compbiolchem.2022.107788
  • 14.Abdukodirov EI, Khalimova KhM, Matmurodov RJ. Hereditary-Genealogical features of Parkinson’s disease and their early detection of the disease. IJHS. 2022. doi: 10.53730/ijhs.v6ns1.5802
  • 15.Salmanpour MR, Shamsaei M, Saberi A, Klyuzhin IS, Tang J, Sossi V, et al. Machine learning methods for optimal prediction of motor outcome in Parkinson’s disease. Phys Med. 2020;69:233–40. doi: 10.1016/j.ejmp.2019.12.022
  • 16.Sadek RM, Mohammed SA, Abunbehan ARK, Ghattas AKHA, Badawi MR, Mortaja MN, et al. Parkinson’s Disease Prediction Using Artificial Neural Network. International Journal of Academic Health and Medical Research (IJAHMR). 2019;3(1):1–8.
  • 17.Govindu A, Palwe S. Early detection of Parkinson’s disease using machine learning. Procedia Computer Science. 2023;218:249–61. doi: 10.1016/j.procs.2023.01.007
  • 18.Wang M, Li Z, Lee EY, Lewis MM, Zhang L, Sterling NW, et al. Predicting the multi-domain progression of Parkinson’s disease: a Bayesian multivariate generalized linear mixed-effect model. BMC Med Res Methodol. 2017;17(1):147. doi: 10.1186/s12874-017-0415-4
  • 19.Ayaz Z, Naz S, Khan NH, Razzak I, Imran M. Automated methods for diagnosis of Parkinson’s disease and predicting severity level. Neural Comput & Applic. 2022. doi: 10.1007/s00521-021-06626-y
  • 20.Makarious MB, Leonard HL, Vitale D, Iwaki H, Sargent L, Dadu A, et al. Multi-modality machine learning predicting Parkinson’s disease. NPJ Parkinsons Dis. 2022;8(1):35. doi: 10.1038/s41531-022-00288-w
  • 21.Yang Y, Ye C, Sun J, Liang L, Lv H, Gao L, et al. Alteration of brain structural connectivity in progression of Parkinson’s disease: a connectome-wide network analysis. Neuroimage Clin. 2021;31:102715. doi: 10.1016/j.nicl.2021.102715
  • 22.Jain S, Shetty S. Improving accuracy in noninvasive telemonitoring of progression of Parkinson’s disease using two-step predictive model. In: 2016 Third International Conference on Electrical, Electronics, Computer Engineering and their Applications (EECEA); 2016. p. 104–9.
  • 23.Anisha CD, Arulanand N. Early Prediction of Parkinson’s Disease (PD) using ensemble classifiers. In: 2020 International Conference on Innovative Trends in Information Technology (ICITIIT), 2020. doi: 10.1109/icitiit49094.2020.9071562
  • 24.Chen J, Zhao D, Wang Q, Chen J, Bai C, Li Y, et al. Predictors of cognitive impairment in newly diagnosed Parkinson’s disease with normal cognition at baseline: a 5-year cohort study. Front Aging Neurosci. 2023;15:1142558. doi: 10.3389/fnagi.2023.1142558
  • 25.Liu Y, Xiao B, Zhang C, Li J, Lai Y, Shi F, et al. Predicting motor outcome of subthalamic nucleus deep brain stimulation for Parkinson’s disease using quantitative susceptibility mapping and radiomics: a pilot study. Front Neurosci. 2021;15:731109. doi: 10.3389/fnins.2021.731109
  • 26.Redenšek S, Jenko Bizjan B, Trošt M, Dolžan V. Clinical-pharmacogenetic predictive models for time to occurrence of levodopa related motor complications in Parkinson’s disease. Front Genet. 2019;10:461. doi: 10.3389/fgene.2019.00461
  • 27.Krishnagopal S, Coelln R von, Shulman LM, Girvan M. Identifying and predicting Parkinson’s disease subtypes through trajectory clustering via bipartite networks. PLoS One. 2020;15(6):e0233296. doi: 10.1371/journal.pone.0233296
  • 28.Jackson H, Anzures-Cabrera J, Taylor KI, Pagano G, Pasadena Investigators, Prasinezumab Study Group. Hoehn and yahr stage and striatal dat-SPECT uptake are predictors of Parkinson’s disease motor progression. Front Neurosci. 2021;15:765765. doi: 10.3389/fnins.2021.765765
  • 29.Tang J, Yang B, Adams MP, Shenkov NN, Klyuzhin IS, Fotouhi S, et al. Artificial neural network-based prediction of outcome in Parkinson’s Disease patients using DaTscan SPECT imaging features. Mol Imaging Biol. 2019;21(6):1165–73. doi: 10.1007/s11307-019-01334-5
  • 30.Salmanpour MR, Shamsaei M, Hajianfar G, Soltanian-Zadeh H, Rahmim A. Longitudinal clustering analysis and prediction of Parkinson’s disease progression using radiomics and hybrid machine learning. Quant Imaging Med Surg. 2022;12(2):906–19. doi: 10.21037/qims-21-425
  • 31.Hu W, Wang W, Wang Y, Chen Y, Shang X, Liao H, et al. Retinal age gap as a predictive biomarker of future risk of Parkinson’s disease. Age Ageing. 2022;51(3):afac062. doi: 10.1093/ageing/afac062
  • 32.Sadaei HJ, Cordova-Palomera A, Lee J, Padmanabhan J, Chen S-F, Wineinger NE, et al. Genetically-informed prediction of short-term Parkinson’s disease progression. NPJ Parkinsons Dis. 2022;8(1):143. doi: 10.1038/s41531-022-00412-w
  • 33.Kim R, Lee J, Kim H-J, Kim A, Jang M, Jeon B, et al. CSF β-amyloid42 and risk of freezing of gait in early Parkinson disease. Neurology. 2019;92(1):e40–7. doi: 10.1212/WNL.0000000000006692
  • 34.Li Y, Huang X, Ruan X, Duan D, Zhang Y, Yu S, et al. Baseline cerebral structural morphology predict freezing of gait in early drug-naïve Parkinson’s disease. NPJ Parkinsons Dis. 2022;8(1):176. doi: 10.1038/s41531-022-00442-4
  • 35.Chen J, Chen B, Zhao D, Feng X, Wang Q, Li Y, et al. Predictors for early-onset psychotic symptoms in patients newly diagnosed with Parkinson’s disease without psychosis at baseline: a 5-year cohort study. CNS Neurosci Ther. 2024;30(3):e14651. doi: 10.1111/cns.14651
  • 36.Ygland Rödström E, Puschmann A. Clinical classification systems and long-term outcome in mid- and late-stage Parkinson’s disease. NPJ Parkinsons Dis. 2021;7(1):66. doi: 10.1038/s41531-021-00208-4
  • 37.Skorvanek M, Martinez-Martin P, Kovacs N, Rodriguez-Violante M, Corvol J-C, Taba P, et al. Differences in MDS-UPDRS scores based on Hoehn and Yahr stage and disease duration. Mov Disord Clin Pract. 2017;4(4):536–44. doi: 10.1002/mdc3.12476
  • 38.Wang W, Mei M, Gao Y, Huang B, Qiu Y, Zhang Y, et al. Changes of brain structural network connection in Parkinson’s disease patients with mild cognitive dysfunction: a study based on diffusion tensor imaging. J Neurol. 2020;267(4):933–43. doi: 10.1007/s00415-019-09645-x
  • 39.Taylor JL. Exercise and the brain in cardiovascular disease: a narrative review. Heart and Mind. 2023;7(1):5–12. doi: 10.4103/hm.hm_50_22
  • 40.Pyatigorskaya N, Magnin B, Mongin M, Yahia-Cherif L, Valabregue R, Arnaldi D, et al. Comparative study of MRI biomarkers in the substantia Nigra to discriminate idiopathic Parkinson Disease. AJNR Am J Neuroradiol. 2018;39(8):1460–7. doi: 10.3174/ajnr.A5702
  • 41.Bowman FD, Drake DF, Huddleston DE. Multimodal imaging signatures of Parkinson’s Disease. Front Neurosci. 2016;10:131. doi: 10.3389/fnins.2016.00131
  • 42.Shih Y-C, Tseng W-YI, Montaser-Kouhsari L. Recent advances in using diffusion tensor imaging to study white matter alterations in Parkinson’s disease: a mini review. Front Aging Neurosci. 2023;14:1018017. doi: 10.3389/fnagi.2022.1018017
  • 43.Lei H, Huang Z, Elazab A, Li H, Lei B. Longitudinal and multi-modal data learning via joint embedding and sparse regression for Parkinson’s disease diagnosis. In: International Workshop on Machine Learning in Medical Imaging. 2018. p. 310–8.
  • 44.Lei H, Huang Z, Zhang J, Yang Z, Tan E-L, Zhou F, et al. Joint detection and clinical score prediction in Parkinson’s disease via multi-modal sparse learning. Expert Systems with Applications. 2017;80:284–96. doi: 10.1016/j.eswa.2017.03.038
  • 45.Chougar L, Faouzi J, Pyatigorskaya N, Yahia-Cherif L, Gaurav R, Biondetti E, et al. Automated categorization of Parkinsonian syndromes using magnetic resonance imaging in a clinical setting. Mov Disord. 2021;36(2):460–70. doi: 10.1002/mds.28348
  • 46.Kim J-Y, Shim J-H, Baek H-M. White matter microstructural alterations in newly diagnosed Parkinson’s Disease: a whole-brain analysis using dMRI. Brain Sci. 2022;12(2):227. doi: 10.3390/brainsci12020227
  • 47.Shin N-Y, Bang M, Yoo S-W, Kim J-S, Yun E, Yoon U, et al. Cortical thickness from MRI to predict conversion from mild cognitive impairment to dementia in Parkinson Disease: a machine learning-based model. Radiology. 2021;300(2):390–9. doi: 10.1148/radiol.2021203383
  • 48.Wee C-Y, Yap P-T, Li W, Denny K, Browndyke JN, Potter GG, et al. Enriched white matter connectivity networks for accurate identification of MCI patients. Neuroimage. 2011;54(3):1812–22. doi: 10.1016/j.neuroimage.2010.10.026
  • 49.Zhang Q, Wang H, Shi Y, Li W. White matter biomarker for predicting de novo Parkinson’s disease using tract-based spatial statistics: a machine learning-based model. Quant Imaging Med Surg. 2024;14(4):3086–106. doi: 10.21037/qims-23-1478
  • 50.Yao J, Huang T, Tian Y, Zhao H, Li R, Yin X, et al. Early detection of dopaminergic dysfunction and glymphatic system impairment in Parkinson’s disease. Parkinsonism Relat Disord. 2024;127:107089. doi: 10.1016/j.parkreldis.2024.107089
  • 51.Taoka T, Masutani Y, Kawai H, Nakane T, Matsuoka K, Yasuno F, et al. Evaluation of glymphatic system activity with the diffusion MR technique: diffusion tensor image analysis along the perivascular space (DTI-ALPS) in Alzheimer’s disease cases. Jpn J Radiol. 2017;35(4):172–8. doi: 10.1007/s11604-017-0617-z
  • 52.Chen HL, Chen PC, Lu CH, Tsai NW, Yu CC, Chou KH, et al. Associations among Cognitive functions, plasma DNA, and diffusion tensor image along the perivascular space (DTI-ALPS) in patients with Parkinson’s Disease. Oxidative Medicine and Cellular Longevity. 2021;2021(1):4034509:1–10.
  • 53.Meng J-C, Shen M-Q, Lu Y-L, Feng H-X, Chen X-Y, Xu D-Q, et al. Correlation of glymphatic system abnormalities with Parkinson’s disease progression: a clinical study based on non-invasive fMRI. J Neurol. 2024;271(1):457–71. doi: 10.1007/s00415-023-12004-6
  • 54.Wood KH, Nenert R, Miften AM, Kent GW, Sleyster M, Memon RA, et al. Diffusion tensor imaging-along the perivascular-space index is associated with disease progression in Parkinson’s Disease. Mov Disord. 2024;39(9):1504–13. doi: 10.1002/mds.29908

Decision Letter 0

Nima Broomand Lomer

25 Jun 2025

PONE-D-25-29378

Cross-modal fusion of brain imaging and clinical data for Parkinson's disease progression prediction

PLOS ONE

Dear Dr. Wen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The reviewers' comments for the authors are provided below. If there are any comments you are unable or choose not to address, please include an explanation. While it is not mandatory to implement every suggestion, the feedback from the reviewers and editor is intended to help improve the overall quality of your manuscript and should be carefully considered. We would be pleased to reconsider your manuscript should you choose to submit a revised version.

Please submit your revised manuscript by Aug 09 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Nima Broomand Lomer, M.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS One has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match.

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

4. Thank you for stating the following financial disclosure:

“This work was supported in part by the National Natural Science Foundation of China (No. 62072126), in part by the Fundamental Research Projects Jointly Funded by Guangzhou Council and Municipal Universities No. SL2023A03J00639, in part by the Key Laboratory of Philosophy and Social Sciences in Guangdong Province of Maritime Silk Road of Guangzhou University (GD22TWCXGC15), in part by the Natural Science Foundation of Chongqing (No. CSTB2024NSCQ-MSX1087), and the Guangxi Science and Technology Program (No.AD23023001).”

Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

If this statement is not correct you must amend it as needed.

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

5. Thank you for stating in your Funding Statement:

“This work was supported in part by the National Natural Science Foundation of China (No. 62072126), in part by the Fundamental Research Projects Jointly Funded by Guangzhou Council and Municipal Universities No. SL2023A03J00639, in part by the Key Laboratory of Philosophy and Social Sciences in Guangdong Province of Maritime Silk Road of Guangzhou University (GD22TWCXGC15), in part by the Natural Science Foundation of Chongqing (No. CSTB2024NSCQ-MSX1087), and the Guangxi Science and Technology Program (No.AD23023001).”

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.  Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement.

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

6. Thank you for stating the following in the Acknowledgments Section of your manuscript:

“This work was supported in part by the National Natural Science Foundation of China (No. 62072126), in part by the Fundamental Research Projects Jointly Funded by Guangzhou Council and Municipal Universities No. SL2023A03J00639, in part by the Key Laboratory of Philosophy and Social Sciences in Guangdong Province of Maritime Silk Road of Guangzhou University (GD22TWCXGC15), in part by the Natural Science Foundation of Chongqing (No. CSTB2024NSCQ-MSX1087), and the Guangxi Science and Technology Program (No.AD23023001).”

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

“This work was supported in part by the National Natural Science Foundation of China (No. 62072126), in part by the Fundamental Research Projects Jointly Funded by Guangzhou Council and Municipal Universities No. SL2023A03J00639, in part by the Key Laboratory of Philosophy and Social Sciences in Guangdong Province of Maritime Silk Road of Guangzhou University (GD22TWCXGC15), in part by the Natural Science Foundation of Chongqing (No. CSTB2024NSCQ-MSX1087), and the Guangxi Science and Technology Program (No.AD23023001).”

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

7. Please include a separate caption for each figure in your manuscript.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: No

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. Clinical Information of Study Subjects

Please provide a demographic table summarizing the clinical characteristics of the Parkinson’s disease patients included in this study. In particular, information for both the HYS deterioration group and the non-deterioration group at baseline and at the 5-year follow-up is necessary to allow for a clearer understanding of cohort composition and comparability.

2. Clinical Significance of HYS Changes at Different Stages

This study categorizes patients based on the presence or absence of Hoehn and Yahr Scale (HYS) progression. However, the clinical significance of changes in HYS scores depends heavily on the specific stages involved. For example, a change from stage 1 to 2 may simply reflect a transition within the early, so-called “honeymoon” phase, while a shift from stage 2 to 3 typically indicates entry into the more progressive stage of the disease. I recommend addressing this issue in the discussion or including it as part of the limitations of the study.

3. Limitations of HYS as an Assessment Scale

The HYS is a relatively coarse measure of disease severity. In clinical practice, many PD patients are classified as stage II, but the degree of motor symptom severity within this group can vary widely. Therefore, many previous studies use MDS-UPDRS Part III scores as a more granular and sensitive indicator of motor function. Please consider discussing this limitation of the HYS in light of the metrics used in prior literature, particularly as it relates to your classifier’s outcome and feature selection.

4. Comparison with Previous Studies

While this study proposes a novel machine learning classifier to predict PD progression, there is a large body of existing literature with similar goals. However, the current manuscript lacks a comparative analysis of the proposed model’s performance relative to those of prior studies. I strongly recommend including a discussion of how your model’s AUC and other evaluation metrics compare to previously published classifiers, to contextualize the novelty and value of your findings.

5. Rationale for Feature Selection

Although clinical features were selected using the Lasso method, the imaging modalities (DTI and DAT SPECT) were chosen a priori. The rationale for restricting imaging feature selection to DTI and DAT SPECT is not clearly explained. Given that many previous studies have employed structural features extracted from 3D T1-weighted images—such as brain volume and cortical thickness—it would be worth discussing whether incorporating such features could potentially improve classifier performance.

Reviewer #2: The authors aimed to develop a model based on MRI/DTI scans to provide a reliable biomarker for predicting PD at an early stage. Using the PPMI database of multi-center PD patients, the proposed cross-modality fusion prediction method (CMFP) appears superior to single-modality approaches in predicting PD progression. Machine learning is an increasingly promising tool for clinical diagnosis; however, its comparative predictive value when combined with MRI/DTI remains unclear. In this sense, the manuscript provides a badly needed model. I have several major and minor concerns regarding this manuscript (see below).

Major concerns:

1. The authors reported their results in a very casual way. They essentially skipped the reporting of Figure 4 and Figure 5 in the Results (lines 260) and instead described them in detail in the Discussion. As a result, the Discussion section mixes results with discussion;

2. It is very confusing that CMFP appears to be the prediction model the authors are promoting, yet CMFP is rarely mentioned in the manuscript (not once in the Discussion). Instead, the authors focus on the Decision Fusion Predictive Model (line 96);

3. Many multi-modal prediction models exist, as the authors mention (refs 25-29). The rationale for, and the mechanisms by which, CMFP is superior to other models are not clear in this manuscript. Have the authors applied these models to the PPMI dataset used in this study? Have the authors applied their CMFP model to other PD dataset(s)? This comparison is missing from the Results, but it should at least be mentioned in the Discussion.

Minor concerns:

1. The writing and organization of the manuscript are chaotic. Specifically: 1) the long passage (lines 16-51) in the Introduction section would be better split into two parts; 2) the use of bullet points in the Introduction is discouraged; 3) Python and PyTorch should be in the Methods section (lines 199-200);

2. The English and grammar need to be improved significantly throughout the manuscript, which is difficult to read and follow. For example, the result summary titles (lines 206, 248) are poorly composed, as is the sentence in line 359, “AUC used to measure the model’s classification ability.” There are many more such cases throughout the text and legends;

3. The statistical significance level was not specified. What main statistical method was used for group comparison (other than the Mann-Whitney U test)? How were the statistics performed? How were age and sex controlled for?

4. In Table 5, the full names of SEN and SPE should be used. There is no need to abbreviate;

5. Remove the numbering in the Methods and Results sections;

6. Remove the redundant abbreviation of PD in the line 274;

7. The manuscript images seem fuzzy and hard to read. Were the authors using appropriate dpi?

Reviewer #3: Researchers attempted to predict Parkinson's disease progression using three data modalities (clinical, DTI imaging, DaTscan) combined through cross-modal fusion to improve upon single-modality predictions. Although the methodology appears rigorous, fundamental questions remain unanswered.

1. Serious methodological problems:

i. Training for 120 epochs with only 123 total patients (even fewer after data splitting) will inevitably cause the model to memorize training examples rather than learn generalizable patterns. Why was such a high number of epochs used?

ii. How is AdaBoost integrated with the Adam optimizer? AdaBoost operates through weighted voting without gradient descent, while Adam is specifically designed for gradient-based neural network optimization - these approaches are fundamentally incompatible.

iii. How can a classification model use mean squared error loss? Classification problems require cross-entropy loss, not regression losses like MSE that are designed for continuous value prediction.

iv. Feature selection performed before data splitting is fundamentally wrong and promotes data leakage, where the selection process "sees" test data and artificially inflates performance estimates.

v. There is a sample imbalance between the positive and negative classes. Was the split stratified?
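
Points iv and v above can be illustrated with a minimal sketch; it is not the authors' pipeline. It assumes synthetic data, builds class-stratified folds, and ranks features using only the training rows of each fold, so the held-out rows never influence selection. The names `stratified_folds` and `select_features` are hypothetical, the class sizes (72 and 51) merely echo the patient counts quoted in the reviews, and a simple difference-in-class-means filter stands in for Lasso to keep the example dependency-free.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with each class spread evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal indices round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def select_features(X, y, train_idx, top_n):
    """Rank features by |mean(class 1) - mean(class 0)| on training rows only.

    This filter is a stand-in for Lasso (an assumption of this sketch).
    """
    pos = [i for i in train_idx if y[i] == 1]
    neg = [i for i in train_idx if y[i] == 0]
    scored = []
    for j in range(len(X[0])):
        mean_pos = sum(X[i][j] for i in pos) / len(pos)
        mean_neg = sum(X[i][j] for i in neg) / len(neg)
        scored.append((abs(mean_pos - mean_neg), j))
    best = sorted(scored, reverse=True)[:top_n]
    return sorted(j for _, j in best)


# Synthetic data: feature 0 carries signal, the other nine are noise.
rng = random.Random(42)
y = [1] * 72 + [0] * 51
X = [[3.0 * label + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(9)]
     for label in y]

folds = stratified_folds(y, k=5)
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(y)) if i not in held_out]
    # Selection is refit inside every fold and never sees the held-out rows.
    selected = select_features(X, y, train_idx, top_n=3)
    n_pos = sum(y[i] for i in test_idx)
    print(f"test size={len(test_idx)} positives={n_pos} selected={selected}")
```

Selecting features once on all 123 rows and only then splitting would let the held-out labels shape the feature set, which is the leakage described in point iv.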

2. Other problems:

i. Variables in equations 1-6 (ρ, ψ, τ, μ, ϕ, λ, ι, α, υ) lack proper definition - what do these variables represent and how do they relate to the fusion process?

ii. No nested cross-validation, no external validation, confidence intervals missing.

iii. The authors show only one p-value, which implies a single statistical test, yet they state that multiple tests were performed. Where is the multiple comparison test?

iv. Claims such as the novelty of CMF are made without proper citations. There are multiple claims without citations.
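
As a hedged illustration of the multiple-comparison question in point iii, the sketch below applies the Holm-Bonferroni step-down adjustment to a family of p-values. The function name `holm_adjust` is hypothetical, and every p-value in the example except 9.183e-4 (the value reported in the abstract) is invented for illustration. Holm's method is only one possible choice; the source does not say which correction, if any, the authors used.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# 9.183e-4 is the p-value quoted in the abstract; the other two are hypothetical.
raw = [9.183e-4, 0.03, 0.04]
for p, p_adj in zip(raw, holm_adjust(raw)):
    print(f"raw={p:.4g} adjusted={p_adj:.4g} significant at 0.05: {p_adj < 0.05}")
```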

3. Grammar:

i. Poor grammar encountered numerous times in the paper.

ii. They write "using imaging-omics, particularly clinical data" which is contradictory since clinical data typically isn't considered part of imaging-omics.

iii. The authors inconsistently refer to their method as "CMFP," "CMF," and "cross-modal fusion" without clearly establishing these as equivalent terms.

iv. Claims like "DTI metric along the perivascular space has relatively more advantages" are vague without specifying what the comparison is against.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: kazuhide seo

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2025 Nov 20;20(11):e0333822. doi: 10.1371/journal.pone.0333822.r002

Author response to Decision Letter 1


23 Jul 2025

The reviewers’ comments and suggestions have been carefully addressed in the revised version of the paper. Changes in the revised version have been colored red. We are grateful once again to the associate editor and reviewers for their time and helpful comments, which have played a vital role in improving the quality and presentation of the original manuscript.

Attachment

Submitted filename: Response to Reviewers.docx

pone.0333822.s001.docx (31.8KB, docx)

Decision Letter 1

Nima Broomand Lomer

21 Aug 2025

PONE-D-25-29378R1

Cross-modal fusion of brain imaging and clinical data for Parkinson's disease progression prediction

PLOS ONE

Dear Dr. Wen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please address the concerns raised by Reviewer 3 thoroughly and resubmit the manuscript for reevaluation. If any comments cannot be addressed, provide a clear justification.

Please submit your revised manuscript by Oct 05 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Nima Broomand Lomer, M.D.

Academic Editor

PLOS ONE

Journal Requirements:

1. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise. 

2. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #3: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I have carefully reviewed the authors' detailed responses and the revised manuscript. Thank you for thoroughly and appropriately addressing all of the reviewer comments I raised. I have no further comments or concerns.

Reviewer #2: The authors have addressed all my concerns. I have no further questions. With that said, the revised Discussion is quite lengthy and the authors should make it more relevant and more precise.

Reviewer #3: Thanks to the authors for the changes. Some of the problems were adequately answered but the paper remains confusing.

Major:

1. Single-modality AUC values (0.40-0.50) are worse than random chance, indicating a methodological issue.

2. Patient number mismatch: "…it was found that 74 patients had scores higher than baseline…" vs "…progression group (n = 72)…".

3. Missing confidence intervals for AUC values in Tables 2, 4, and 6. With a reported variance of 0.1085 and no confidence intervals, the reliability of the 0.7791 AUC cannot be assessed.

4. High performance variance (0.1085) indicates unstable results, but there is no discussion of the implications for reproducibility.
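
The request for confidence intervals in points 3 and 4 can be made concrete with a small sketch, under stated assumptions. It computes AUC as the probability that a positive case outranks a negative one and attaches a percentile bootstrap interval. The data are synthetic, the helper names `auc` and `bootstrap_auc_ci` are hypothetical, and the percentile bootstrap is one common choice rather than the method the authors used. The last lines also relate to point 1: negating the scores maps an AUC of a to 1 - a, which is why values well below 0.5 usually point to an inverted label or score convention rather than an uninformative model.

```python
import random


def auc(scores, labels):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples containing a single class, where AUC is undefined.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic scores for 123 hypothetical patients (72 positive, 51 negative).
rng = random.Random(1)
labels = [1] * 72 + [0] * 51
scores = [rng.gauss(0.8 * y, 1.0) for y in labels]

point = auc(scores, labels)
low, high = bootstrap_auc_ci(scores, labels)
print(f"AUC={point:.3f}, 95% CI=({low:.3f}, {high:.3f})")

# Negating the scores mirrors the AUC around 0.5.
flipped = auc([-s for s in scores], labels)
print(f"AUC with negated scores={flipped:.3f}")
```

With a cohort of this size the interval is typically wide, which is the reviewer's point: a single AUC value says little about reliability without its interval.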

Minor:

1. CMFP/CFMP, CMF/CFM used interchangeably.

2. "Multivariate regression analysis further confirmed a negative…" is missing a citation.

3. ACC, AUC, ESS, RBDSQ, UPSIT, etc. are undefined in the introduction. Please provide a table of all abbreviations.

4. In Table 2, "tau i" should explicitly state that "i" is the corresponding "i" given in Table 3; otherwise it is confusing. Alternatively, present Table 3 before Table 2.

5. The first three paragraphs of the discussion, beginning "The clinical significance of HYS…", present claims without citations.

6. "four clinical data" in the methods should be "four clinical variables"; there are other grammatical errors throughout.

7. "Larger sample size for improved prediction" in the comparison with existing methods: a larger sample size reduces variance but does not guarantee better performance. Liu et al. (citation 25) achieved higher AUC/ACC scores with only 33 PD patients.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: kazuhide seo

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2025 Nov 20;20(11):e0333822. doi: 10.1371/journal.pone.0333822.r004

Author response to Decision Letter 2


16 Sep 2025

The reviewers’ comments and suggestions have been carefully addressed in the revised version of the paper. Changes in the revised version have been colored red. We are grateful once again to the associate editor and reviewers for their time and helpful comments, which have played a vital role in improving the quality and presentation of the original manuscript.

Attachment

Submitted filename: Response_to_Reviewers_auresp_2.docx

pone.0333822.s002.docx (18.8KB, docx)

Decision Letter 2

Nima Broomand Lomer

18 Sep 2025

Cross-modal fusion of brain imaging and clinical data for Parkinson's disease progression prediction

PONE-D-25-29378R2

Dear Dr. Wen,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

All concerns have been successfully addressed, and the manuscript has been significantly improved.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Nima Broomand Lomer, M.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Nima Broomand Lomer

PONE-D-25-29378R2

PLOS ONE

Dear Dr. Wen,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Nima Broomand Lomer

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: Response to Reviewers.docx

    pone.0333822.s001.docx (31.8KB, docx)

    Attachment

    Submitted filename: Response_to_Reviewers_auresp_2.docx

    pone.0333822.s002.docx (18.8KB, docx)

    Data Availability Statement

    The format has been improved according to the requirements of the journal and the code has also been uploaded to Dryad Digital Repository (DOI: 10.5061/dryad.63xsj3vdv), which can be found at http://datadryad.org/share/zvyCwl-SD-ApTC7IwAXSLJJxA1W53rjXR_c1xPykaxM.


    Articles from PLOS One are provided here courtesy of PLOS
