PLOS ONE. 2021 Jun 28;16(6):e0245026. doi: 10.1371/journal.pone.0245026

Estimating pulse wave velocity from the radial pressure wave using machine learning algorithms

Weiwei Jin 1,*, Philip Chowienczyk 2, Jordi Alastruey 1,3
Editor: Alberto Milan 4
PMCID: PMC8238176  PMID: 34181640

Abstract

One of the European gold standard measurements of vascular ageing, a risk factor for cardiovascular disease, is the carotid-femoral pulse wave velocity (cfPWV), which requires an experienced operator to measure pulse waves at two sites. In this work, two machine learning pipelines were proposed to estimate cfPWV from the peripheral pulse wave measured at a single site, the radial pressure wave measured by applanation tonometry. The study populations were the Twins UK cohort containing 3,082 subjects aged from 18 to 110 years, and a database containing 4,374 virtual subjects aged from 25 to 75 years. The first pipeline uses Gaussian process regression to estimate cfPWV from features extracted from the radial pressure wave using pulse wave analysis. The mean difference and the upper and lower limits of agreement (LOA) of the estimation on the 924 hold-out test subjects from the Twins UK cohort were 0.2 m/s, and 3.75 m/s & -3.34 m/s, respectively. The second pipeline uses a recurrent neural network (RNN) to estimate cfPWV from the entire radial pressure wave. The mean difference and the upper and lower LOA of the estimation on the 924 hold-out test subjects from the Twins UK cohort were 0.05 m/s, and 3.21 m/s & -3.11 m/s, respectively. The percentage error of the RNN estimates on the virtual subjects increased by less than 2% when adding 20% of random noise to the pressure waveform. These results show the possibility of assessing vascular ageing using a single peripheral pulse wave (e.g. the radial pressure wave), instead of cfPWV. The code for the machine learning pipelines is available from the following online repository (https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal).

Introduction

Vascular ageing is a result of the age-induced damage inflicted upon the vascular structure and function, which leads to increased risk of chronic diseases, such as cardiovascular disease (CVD), and type 2 diabetes [1, 2]. Reducing the risk factors related to vascular ageing (e.g. blood pressure, glycemia, and lipids) at an early stage could prevent further progression of the disease [3]. Further studies have also shown that vascular ageing is associated with lifestyles [4] and exercise [5]. Thus, detecting vascular ageing at an early stage can lead to early intervention and prevention of the relevant diseases.

Studies have shown that arterial stiffening, a result of the loss of compliance function, is one of the main players in vascular ageing [6, 7]. It has been suggested that arterial stiffness can be evaluated through the measurement of pulse wave velocity (PWV) [8, 9], for which the European standard assessment is the carotid-femoral PWV (cfPWV) [10]. Despite its wide use, cfPWV requires measurements at two arterial sites, manual handling of the probes, and an estimate of the distance between the carotid and femoral arteries, which makes the measurement operator-dependent. A single-site and automated measurement could overcome these limitations of the current clinical assessment of arterial stiffening.

Machine learning methods have been applied to solve a range of medical challenges, including detecting CVD. The majority of the machine learning research involving medical signals is based on either electrocardiogram (ECG) [11, 12] or photoplethysmogram (PPG) [13] data. Those studies mainly focused on critical CVD that could lead to mortality, such as heart failure [14, 15]. However, the development of CVD is a long process, and early detection and intervention can stop disease progression and avoid high medical costs and mortality [16]. Using machine learning methods to detect earlier signs of CVD would therefore be beneficial in improving cardiovascular health. Although little work has been done to assess CVD risk via machine learning methods, researchers have recently become engaged in the subject. For instance, a recent study has proposed a potential algorithm to estimate the size of an abdominal aortic aneurysm from pressure waves measured at the carotid, brachial and femoral arteries using deep learning models [17]. In vascular ageing research, Tavallali et al. used an artificial neural network to estimate cfPWV with a root mean square error (RMSE) of 1.1244 m/s. However, their approach required a central pressure wave, the carotid pressure wave, and also included other medical record information, such as chronological age [18].

This study aims to estimate cfPWV (hereafter referred to as PWV) from only the pulse wave measured at a single peripheral site (i.e. the radial artery) using machine learning algorithms. The following three case studies are considered. Case Study 1 proposes a machine learning pipeline that uses Gaussian process regression to estimate PWV from key features (timing and magnitude of the fiducial points and the heart rate) extracted from the radial pressure wave measured in the Twins UK cohort. Case Study 2 presents a second machine learning pipeline that uses a recurrent neural network (RNN) with long short-term memory (LSTM) to estimate PWV from the entire radial pressure waveform, also on the Twins UK cohort. Case Study 3 assesses the ability of the RNN model to estimate PWV from the radial pressure waveform from a database of virtual subjects, with random noise added. Both machine learning pipelines presented in this article are available from the following online repository (https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal).

Case Study 1: PWV estimation from radial pressure wave features

Methods

Study population

The study population in Case Study 1 consisted of 3,082 unselected twins (99% female) from the Twins UK cohort. The mean and standard deviation of the biological characteristics of these subjects can be found in Table 1. The study was approved by the St Thomas’ Hospital Research Ethics Committees, and all subjects gave written informed consent. Most of the measurement data from the Twins UK cohort are available to external researchers via an application. More information about this cohort can be found on its official website (https://twinsuk.ac.uk) and in relevant publications [19, 20]. The data used in this case study were the radial pressure waves measured by applanation tonometry and cfPWV measured by SphygmoCor CvMS. The data were acquired by an experienced operator over the period 2006 to 2017.

Table 1. Biological characteristics of the subjects from the Twins UK cohort (N = 3,082).
Characteristic   Mean ± SD
Height (cm)      163.2 ± 22.5
Weight (kg)      69.2 ± 27.1
BMI (kg/m²)      26.2 ± 18.1
Age (year)       57.8 ± 12.8
DBP (mmHg)       74.1 ± 8.9
SBP (mmHg)       126.5 ± 17.5
MAP (mmHg)       93.6 ± 11.9
PWV (m/s)        9.39 ± 2.18

SD: Standard Deviation; BMI: body mass index; DBP: diastolic blood pressure; SBP: systolic blood pressure; MAP: mean arterial pressure; PWV: pulse wave velocity.

Wave feature extraction

The features of the radial pressure wave, namely the timings and magnitudes of the fiducial points identified on the waveform and the heart rate, were extracted using the pulse wave analyser developed by Charlton et al. [21]. In total, 14 fiducial points were identified on each waveform, giving 29 features per radial pressure wave. More detailed descriptions of the fiducial points can be found in the previous studies by Charlton et al. [21, 22].

Preprocessing for Gaussian process regression

Before performing the Gaussian process regression, LASSO regression was performed to identify the key features among all features extracted from the waveform. Principal component analysis (PCA) was then performed to exclude outliers from the analysed dataset, as outliers could affect the accuracy of machine learning algorithms [23]. The linear model module from the scikit-learn package was used to perform the LASSO regression in Python. The hyperparameter in the model was found by 5-fold cross-validation using the GridSearchCV class. Then, PCA was performed on the key features identified by the LASSO regression using the PCA class from the scikit-learn package. Finally, based on the distance of the data points from the origin, outliers were identified and excluded from the machine learning training and testing.
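The two preprocessing steps described above could be sketched as follows. This is a minimal illustration on synthetic data, not the authors' script: the feature matrix, the `alpha` grid, the standardisation before LASSO, the use of two principal components, and the distance cut-off (`threshold`) are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the 29 waveform features (NOT the Twins UK data).
n_subjects, n_features = 300, 29
X = rng.normal(size=(n_subjects, n_features))
# In this toy target only the first five features carry signal.
weights = np.array([1.0, -0.8, 0.6, 0.5, -0.4])
y = 9.0 + X[:, :5] @ weights + 0.3 * rng.normal(size=n_subjects)

# Assumption: features are standardised before the LASSO fit.
X_std = StandardScaler().fit_transform(X)

# Step 1: LASSO, with the penalty strength chosen by 5-fold cross-validation.
search = GridSearchCV(
    Lasso(max_iter=10_000),
    param_grid={"alpha": np.logspace(-3, 0, 20)},  # hypothetical grid
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X_std, y)
key_mask = search.best_estimator_.coef_ != 0.0  # features LASSO kept
X_key = X_std[:, key_mask]

# Step 2: PCA on the retained features, then flag points far from the origin.
scores = PCA(n_components=2).fit_transform(X_key)
distance = np.linalg.norm(scores, axis=1)
threshold = distance.mean() + 3.0 * distance.std()  # hypothetical cut-off
is_outlier = distance > threshold

X_clean, y_clean = X_key[~is_outlier], y[~is_outlier]
print(f"kept {key_mask.sum()} of {n_features} features, "
      f"removed {is_outlier.sum()} of {n_subjects} subjects")
```

The cut-off rule is the part most likely to differ in practice, since the text only states that outliers were identified from their distance to the origin in PCA space.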

Gaussian process regression

Gaussian process regression was used to estimate PWV based on the key features from the radial pressure wave identified by LASSO regression. The advantages of using Gaussian process regression are i) it can provide the uncertainty of the estimation, which most machine learning regression methods cannot; and ii) the hyperparameters in the model can be identified by maximising the log likelihood, which is less time-consuming than cross-validation. The GaussianProcessRegressor class and kernel functions from the scikit-learn package were used to perform Gaussian process regression in Python. Three kernel functions (radial basis function (RBF), Matérn kernel with ν = 5/2, and rational quadratic kernel) and their sum combinations were tested (results shown in S1 Fig). Finally, the rational quadratic kernel was chosen for this study based on the accuracy of its estimation.
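A minimal sketch of Gaussian process regression with a rational quadratic kernel in scikit-learn is given below. The data are synthetic, and the added `WhiteKernel` noise term, the initial kernel parameters, and `normalize_y=True` are assumptions for illustration rather than settings reported in this study.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RationalQuadratic, WhiteKernel

rng = np.random.default_rng(1)

# Synthetic features and target (NOT the Twins UK data).
X_train = rng.uniform(-2, 2, size=(80, 3))
y_train = (9.0 + np.sin(X_train[:, 0]) + 0.5 * X_train[:, 1]
           + 0.1 * rng.normal(size=80))
X_test = rng.uniform(-2, 2, size=(20, 3))

# Rational quadratic kernel; the WhiteKernel term is an assumed noise model.
kernel = RationalQuadratic(length_scale=1.0, alpha=1.0) + WhiteKernel(noise_level=0.1)

# Kernel hyperparameters are fitted by maximising the log marginal likelihood.
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)

# Predictive mean and standard deviation give an approximate 95% interval.
mean, std = gpr.predict(X_test, return_std=True)
lower, upper = mean - 1.96 * std, mean + 1.96 * std

print("fitted kernel:", gpr.kernel_)
print("log marginal likelihood:", gpr.log_marginal_likelihood_value_)
```

The per-sample interval `[lower, upper]` is what distinguishes this approach from the point estimates returned by most other regressors.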

Other machine learning methods

To confirm the accuracy of the PWV estimation by Gaussian process regression, three other machine learning methods were also used to estimate the PWV: support vector regression (SVR), and two tree-based methods (i.e. random forest regression and gradient boosting regression). All machine learning algorithms were implemented using the scikit-learn package. The hyperparameters in the SVR were tuned by 5-fold cross-validation with 10 iterations using the optunity package. The hyperparameters in the tree-based methods were tuned by 10-fold cross-validation with 100 iterations using random search from the scikit-learn package. In addition, for all methods apart from the tree-based ones, the features from the radial pressure wave were normalised using the StandardScaler class in the scikit-learn package. The training and testing/developing data ratio for all machine learning analyses was 7:3.
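The 7:3 split, the scaling for SVR, and the random search for a tree-based model could be sketched as below. The data are synthetic, the search space is hypothetical, and the search is deliberately small (3 folds, 4 iterations) to keep the example fast, whereas the text reports 10 folds and 100 iterations. The SVR here uses fixed hyperparameters instead of the optunity tuning described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(2)

# Synthetic stand-in with the reported sample and key-feature counts.
n_subjects = 3079
X = rng.normal(size=(n_subjects, 17))
y = 9.4 + X[:, 0] + 0.5 * rng.normal(size=n_subjects)

# 7:3 split: 2,155 training and 924 test rows for 3,079 samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# SVR on standardised features (untuned; hyperparameters are placeholders).
svr = make_pipeline(StandardScaler(), SVR(C=1.0, epsilon=0.1))
svr.fit(X_train, y_train)

# Random forest tuned by random search; no feature scaling is applied.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={            # hypothetical search space
        "n_estimators": [20, 50],
        "max_depth": [3, 6, None],
        "min_samples_leaf": [1, 5],
    },
    n_iter=4,
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X_train, y_train)

for name, model in [("SVR", svr), ("Random forest", search.best_estimator_)]:
    rmse = np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2))
    print(f"{name}: RMSE = {rmse:.2f} m/s")
```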

Error evaluation

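The root mean square error (RMSE) was calculated to evaluate each machine learning approach, which is defined as,

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(\widehat{\mathrm{PWV}}_i - \mathrm{PWV}_i\right)^2}{n}}, \tag{1}$$

where $n$ is the size of the test dataset; $\widehat{\mathrm{PWV}}_i$ and $\mathrm{PWV}_i$ are the $i$th estimated and measured PWV, respectively. Then, a percentage error, $\epsilon$, was calculated based on the RMSE:

$$\epsilon = \frac{\mathrm{RMSE}}{\overline{\mathrm{PWV}}} \times 100\%, \tag{2}$$

where $\overline{\mathrm{PWV}}$ is the mean value of the PWV of the study population.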
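The two error metrics can be computed in a few lines. The sketch below uses hypothetical helper names (`rmse`, `percentage_error`) and toy values. As a consistency check, it also applies Eq (2) to the RMSE reported for Gaussian process regression in Table 2 and the cohort mean PWV in Table 1.

```python
import numpy as np

def rmse(estimated, measured):
    """Root mean square error between estimated and measured PWV, Eq (1)."""
    estimated = np.asarray(estimated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean((estimated - measured) ** 2)))

def percentage_error(rmse_value, mean_pwv):
    """RMSE expressed as a percentage of the population mean PWV, Eq (2)."""
    return 100.0 * rmse_value / mean_pwv

# Toy values in m/s (hypothetical, for illustration only).
measured = [8.0, 9.5, 11.0, 7.2]
estimated = [8.4, 9.0, 10.1, 7.5]
toy_rmse = rmse(estimated, measured)
print(f"toy RMSE = {toy_rmse:.3f} m/s")

# Check against reported values: RMSE 1.82 m/s, mean PWV 9.39 m/s.
reported = percentage_error(1.82, 9.39)
print(f"percentage error = {reported:.1f}%")
```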

Results

The features from the radial pressure wave were reduced from 29 to 17 after performing the LASSO regression. The fiducial points containing those key features are shown in Fig 1a. Then, PCA was performed on the subjects using only those key features (Fig 1b). The results show that 3 of the 3,082 subjects were outliers.

Fig 1. Data pre-processing for pulse wave velocity estimation from the features extracted from the radial pressure wave.

Fig 1

(a) The fiducial points containing key features identified by the LASSO regression. (b) Identified outliers in the database using principal component analysis (PCA). Red, blue and green dots represent subject groups with pulse wave velocity (PWV) less than 7 m/s, 7–9 m/s, and greater than 9 m/s, respectively.

The Gaussian process regression was performed on the study population without the outliers (3,079 data samples). The model was trained on 2,155 data samples. The estimation results and errors when testing on the hold-out test data set containing 924 samples are shown in Fig 2a and 2c, and Table 2, respectively. Fig 2a shows a linear relationship between the estimated and measured PWV, with a slope of 1.00 and an offset of 0.24 m/s. The coefficient of determination, r², equals 0.42, and the p-value is less than 0.0001. The Bland-Altman plot shows a mean difference of 0.2 m/s, and upper and lower limits of agreement (LOA) of 3.75 m/s & -3.34 m/s, respectively (Fig 2c). Both plots suggest that the accuracy of the PWV estimates deteriorated as the value of PWV increased. Table 2 illustrates that PWV could be estimated from the radial pulse waveform with an RMSE of 1.82 m/s and a percentage error, ϵ, of 19.4% over the whole test data set. In addition, Gaussian process regression can also provide a statistically meaningful range (95% confidence interval) that shows the reliability of the estimation (S2 Fig).
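The Bland-Altman quantities quoted above can be reproduced from paired estimates and measurements. The sketch below assumes the conventional definition of the limits of agreement, mean difference ± 1.96 standard deviations, which the text does not state explicitly. The helper name `bland_altman` and the toy values are hypothetical.

```python
import numpy as np

def bland_altman(estimated, measured):
    """Return (mean difference, upper LOA, lower LOA) for paired values."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(measured, dtype=float)
    mean_diff = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation of the differences
    return mean_diff, mean_diff + 1.96 * sd, mean_diff - 1.96 * sd

# Toy example (hypothetical values in m/s).
mean_diff, upper_loa, lower_loa = bland_altman([9.0, 10.0, 11.0], [9.0, 9.0, 9.0])
print(mean_diff, upper_loa, lower_loa)

# Under the same assumption, the reported LOA (3.75 and -3.34 m/s) imply
# this standard deviation of the differences:
implied_sd = (3.75 - (-3.34)) / (2 * 1.96)
print(f"implied SD = {implied_sd:.2f} m/s")
```

Under this assumption the implied standard deviation is close to the reported RMSE of 1.82 m/s, as expected when the mean difference is small.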

Fig 2. Estimation of pulse wave velocity (PWV) on a hold-out test set containing 924 subjects using Gaussian process regression and recurrent neural network with long short-term memory.

Fig 2

Panels (a) and (b) show estimated PWV against measured PWV with the linear regression line in red, the coefficient of determination, r2, and the p-value. Panels (c) and (d) show the Bland-Altman plots comparing the estimated and measured PWV. Panels (e) and (f) show Pearson correlation coefficients (r) between the biological characteristics of the cohort and the “Difference” values shown on panels (c) and (d), respectively. BMI: body mass index; DBP: diastolic blood pressure; SBP: systolic blood pressure; MAP: mean arterial pressure.

Table 2. Root mean square error (RMSE) and percentage error (ϵ) on the estimated pulse wave velocity (PWV) using different machine learning methods.

Method                         RMSE (m/s)   ϵ (%)
Gaussian Process Regression    1.82         19.4
Support Vector Regression      1.74         18.5
Random Forest Regression       1.64         17.4
Gradient Boosting Regression   1.63         17.4
RNN                            1.59         16.9

To confirm the accuracy of the estimation made by Gaussian process regression, three other machine learning methods were applied to the same training and hold-out testing data set to estimate PWV. Table 2 shows the error evaluations of all these methods. The other three machine learning methods provided PWV estimates with smaller errors than Gaussian process regression, with gradient boosting regression achieving the lowest RMSE (1.63 m/s) and ϵ (17.4%). However, the reduction of the errors was limited (less than 0.2 m/s for RMSE, and less than 2% for ϵ). Moreover, these alternative methods cannot provide the reliability of the PWV estimation (i.e. a 95% confidence interval), and take longer to train (≥ 30 minutes, vs ≤ 1 minute for Gaussian process regression). In addition, the plots of measured against estimated PWV and the Bland-Altman plots obtained using these three algorithms can be found in S3 Fig.

The Pearson’s correlation coefficient, r, was used to investigate whether the accuracy of the estimations using Gaussian process regression could be related to the biological characteristics. The following biological characteristics were studied: height, weight, body mass index (BMI), age (chronological age), diastolic blood pressure (DBP), systolic blood pressure (SBP), and mean arterial pressure (MAP). Fig 2e shows that the difference (between the estimated and measured PWV) correlates most strongly with age (r = 0.286).

Case Study 2: PWV estimation from the entire radial pressure wave

Methods

The study population in Case Study 2 is identical to the study population in Case Study 1. The same error evaluation metrics were used to assess the accuracy of PWV estimation. This case study used an RNN model, which is described next.

Recurrent neural network

A schematic of the RNN structure used in this case study is shown in Fig 3. The input data was an array of pressure values describing the radial pressure waveform at discrete time points. As the cardiac cycle of different subjects varied, the time duration of the radial pressure wave also differed from subject to subject. To overcome the length difference in the input data, the waves with shorter durations were extended to the duration of the longest wave by filling the end of the array with dummy values (the maximum floating point number in this case). Then, a masking layer was applied to exclude the dummy values from being considered when estimating PWV. Afterwards, a bidirectional RNN with LSTM was used to process the time-variant radial pressure waveform, as it has been proven effective in forecasting time series data [24–26]. Finally, a dense layer with a linear activation function was used to estimate PWV based on the results from the bidirectional RNN with LSTM. Before carrying out the main simulation, hyperparameter tuning was undertaken and the following parameters were chosen: number of units for LSTM = 16; batch size = 64; epoch number = 1,500; optimizer = Adam. The RNN was constructed using the open-source neural-network library TensorFlow Core v.2.2.0, including its high-level application programming interface Keras. The training and testing/developing data ratio for the RNN was also 7:3.
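The architecture described above (padding, masking, bidirectional LSTM, linear dense output) could be sketched in Keras as follows. This is an illustrative reconstruction rather than the authors' code: the mean squared error loss, the use of the float32 maximum as the dummy value, the helper names `pad_waves` and `build_model`, and the toy waves are assumptions. Training with the reported batch size and epoch number is omitted.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Dummy value marking padded time steps (assumed: largest float32 number).
PAD = float(np.finfo(np.float32).max)

def pad_waves(waves):
    """Right-pad variable-length waves to the longest duration with PAD."""
    longest = max(len(w) for w in waves)
    batch = np.full((len(waves), longest, 1), PAD, dtype=np.float32)
    for row, wave in enumerate(waves):
        batch[row, : len(wave), 0] = wave
    return batch

def build_model(lstm_units=16):
    """Masking -> bidirectional LSTM -> dense layer with linear activation."""
    model = keras.Sequential([
        keras.Input(shape=(None, 1)),          # any number of time steps
        layers.Masking(mask_value=PAD),        # ignore padded time steps
        layers.Bidirectional(layers.LSTM(lstm_units)),
        layers.Dense(1, activation="linear"),  # single PWV estimate
    ])
    model.compile(optimizer="adam", loss="mse")  # loss is an assumption
    return model

# Two toy waves of different durations (hypothetical shapes and units).
waves = [np.sin(np.linspace(0, np.pi, 50)), np.sin(np.linspace(0, np.pi, 70))]
x = pad_waves(waves)
model = build_model()
pwv_hat = model.predict(x, verbose=0)  # untrained, so values are arbitrary
print(x.shape, pwv_hat.shape)
```

Because the masking layer skips the padded steps, the choice of dummy value only matters in that it must never occur in a genuine pressure sample.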

Fig 3. Schematic illustration of the recurrent neural network structure used to estimate pulse wave velocity from the entire radial pressure wave.

Fig 3

Pt−1, Pt and Pt+1 are the radial pressure values at the discrete time points t − 1, t, and t+1, cfPWV is the carotid-femoral pulse wave velocity.

Results

The RNN with LSTM was trained and tested on the same datasets as those used in Case Study 1. Fig 2b and 2d show the performance of the RNN. In comparison with the PWV estimation using Gaussian process regression, the RNN led to a smaller offset on the regression line (0.02 m/s vs 0.24 m/s) and a stronger correlation (r²: 0.49 vs 0.42). The Bland-Altman plots show that both the mean difference and the upper and lower LOA are smaller in magnitude than the corresponding values obtained by Gaussian process regression (Fig 2c and 2d) (0.05 m/s vs 0.2 m/s; 3.21 m/s & -3.11 m/s vs 3.75 m/s & -3.34 m/s). The RMSE and percentage error, ϵ, of PWV estimates using the RNN were similar to those obtained by the other machine learning methods used in Case Study 1 (see Table 2). Furthermore, Pearson’s correlation coefficients, r, between biological characteristics and the difference of measured and estimated PWV calculated for the RNN model (Fig 2f) were similar to the ones obtained using Gaussian process regression, with age again showing the strongest correlation (r = 0.297).

Case Study 3: PWV estimation from the entire radial pressure wave with added random noise

Methods

This case study used the RNN model described in Case Study 2, with the same training and testing/developing data ratio. The same error evaluation metrics as in Case Studies 1 and 2 were used. The following two subsections describe the study population and random noise generation.

Study population

To systematically investigate the effects of high-frequency noise on the radial pressure wave, a database containing 4,374 virtual subjects representative of a sample of “healthy” adults aged between 25 and 75 years old in ten-year increments was used as the study population. The database can be downloaded from the following repository: https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database. The data used in this case study were the radial pressure waves and cfPWV. Further details of this database can be found in a previous study [22]. The rationale behind choosing a database of virtual subjects was to eliminate the possible effects of measurement errors.

Noise generation

Different intensities of high-frequency Gaussian white noise were generated and added to the radial pressure waves to test the noise sensitivity of the PWV estimation by RNN. The intensity of the noise was defined using the signal to noise ratio (SNR), similar to the approach in [27], for which the SNR was calculated as,

$$\mathrm{SNR} = \frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}, \tag{3}$$

where $P_{\mathrm{signal}}$ and $P_{\mathrm{noise}}$ are the power (averaged amplitude) of the pressure signal and Gaussian white noise, respectively. Six different SNRs were considered: 20, 16, 12, 10, 8 and 5. Fig 4 shows the effect of SNRs of 20, 10 and 5 on the original pressure signal.
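One way to generate noise at a prescribed SNR is sketched below. The text describes power as an averaged amplitude without giving the exact formula, so this example assumes the root mean square amplitude and rescales the noise so that the ratio in Eq (3) holds exactly. The function name `add_white_noise` and the pulse-like toy signal are hypothetical.

```python
import numpy as np

def rms(values):
    """Root mean square amplitude (the assumed measure of 'power')."""
    return float(np.sqrt(np.mean(np.square(values))))

def add_white_noise(signal, snr, rng):
    """Return (noisy signal, noise) with rms(signal) / rms(noise) == snr."""
    signal = np.asarray(signal, dtype=float)
    noise = rng.normal(size=signal.shape)          # Gaussian white noise
    noise *= rms(signal) / (snr * rms(noise))      # rescale to the target SNR
    return signal + noise, noise

rng = np.random.default_rng(3)

# Toy pulse-like pressure signal in mmHg (hypothetical shape).
t = np.linspace(0.0, 1.0, 200)
signal = 80.0 + 40.0 * np.sin(np.pi * t) ** 2

for snr in (20, 16, 12, 10, 8, 5):
    noisy, noise = add_white_noise(signal, snr, rng)
    print(f"SNR = {snr:2d}: noise amplitude is "
          f"{100 * rms(noise) / rms(signal):.1f}% of the signal amplitude")
```

With this definition an SNR of 5 corresponds to a noise amplitude of 20% of the signal amplitude, which matches the way the noise level is quoted in the Results.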

Fig 4. An example of an original signal, and the same signal with added white noise, with signal to noise ratios (SNR) of 20, 10 and 5.

Fig 4

Results

The radial pressure waves from the database of virtual subjects augmented with different levels of random Gaussian white noise were used to test the noise sensitivity of the PWV estimation produced by the RNN model. The plots of estimated against measured PWV and the Bland-Altman plots of the estimations from the original radial pressure wave and with SNRs of 20, 10 and 5 are shown in Fig 5. The coefficient of determination, r², was ≥ 0.98 for all cases considered. The mean difference did not increase, but the upper and lower LOA widened from 0.14 m/s & -0.24 m/s to 0.5 m/s & -0.56 m/s when adding 20% noise to the original radial pressure wave (SNR = 5). The RMSE increased from 0.10 m/s to 0.24 m/s, and the percentage error, ϵ, increased from 1.2% to 2.8%, when adding 20% noise to the original radial pressure wave (Table 3). In addition, the errors of the PWV estimates using waveforms without added noise from the database of virtual subjects were over 10 times smaller than those obtained from the Twins UK cohort using the same RNN model (Table 2).

Fig 5. Estimation of PWV on a hold-out test set containing 1,312 virtual subjects using the recurrent neural network, with different levels of added white noise.

Fig 5

Estimated against measured PWV with the linear regression line in red, the coefficient of determination, r2, and the p-value (top). Corresponding Bland-Altman plots (bottom). SNR: signal to noise ratio.

Table 3. Root mean square error (RMSE) and percentage error (ϵ) for the pulse wave velocity (PWV) estimation from the radial pressure wave by the recurrent neural network (RNN), with different intensities of added white noise.

Noise level   RMSE (m/s)   ϵ (%)
Baseline      0.10         1.2
SNR = 20      0.15         1.8
SNR = 16      0.16         1.9
SNR = 12      0.16         1.9
SNR = 10      0.20         2.4
SNR = 8       0.21         2.5
SNR = 5       0.24         2.8

Discussion

We have shown the feasibility of estimating PWV from the radial pressure wave using (i) Gaussian process regression applied to features extracted from the waveform and (ii) an RNN model applied to the entire waveform. The results show that the PWV can be estimated by both pipelines, with the second pipeline presenting a slightly higher accuracy and a lower bias in the estimated PWV. However, the improvement in accuracy for PWV estimation from the second pipeline was limited, which indicates that the features extracted from the radial pressure wave using the pulse wave analyser developed by Charlton et al. [22] may be sufficient to describe the morphology of the entire radial pressure wave. Some of the key features identified by LASSO regression and applied to the PWV estimation using Gaussian process regression have been used to calculate pulse wave indices that are closely related to vascular ageing [28–30]. For instance, the reflection index can be calculated from the feature ‘dia’; the augmentation index and augmentation pressure can be calculated from the features ‘p1in’ and ‘p2pk’; and the modified ageing index is related to the features ‘a’, ‘b’, and ‘c’ calculated from the second derivative of the waveform. In addition, Gaussian process regression can provide a statistically meaningful range (95% confidence interval) that shows the reliability of the estimation, and requires less time to train (less than a minute using the data from the Twins UK cohort). On the other hand, in order to use the pulse wave analyser to extract features from the wave, the wave needs to be preprocessed to eliminate high- and low-frequency noise. This step, which can result in loss of information, is not required by the proposed RNN model, even when using noisy pressure waves.

Comparing our results with those obtained by using other non-invasive devices (e.g. the Pulse Pen [31]) and measurement methods (e.g. the oscillometric method [32]) that require pulse wave measurements at two arterial sites, the mean differences between the estimated and measured PWV were similar or smaller (≤ 0.214 m/s for Pulse Pen, 0.4 m/s for oscillometric method, vs ≤ 0.2 m/s in this study). The upper and lower LOA, however, were larger in this study (≤ 1.346 m/s & ≥ -0.918 m/s for Pulse Pen, ≤ 2.9 m/s & ≥ -2.0 m/s for oscillometric method, vs ≤ 3.75 m/s & ≥ -3.34 m/s in this study). When comparing our results to those obtained by using a non-invasive device that only requires a single pulse wave measurement (e.g. the Arteriograph [33]), the mean difference was the same for the estimation using Gaussian process regression (= 0.2 m/s), and the upper and lower LOA were smaller in this study (≤ 4.5 m/s & ≥ -4.01 m/s for the Arteriograph vs ≤ 3.75 m/s & ≥ -3.34 m/s in this study). The root mean square error (RMSE) of our PWV estimation was larger than that obtained in the machine learning study by Tavallali et al. [18] (RMSE = 1.1244 m/s). This may be explained by the fact that the average PWV in Tavallali et al.’s study was smaller than in this study, and that this study used less patient information (e.g. no chronological age) and no information from central arteries (e.g. the carotid artery).

Based on the ARTERY Society guidelines for validation of non-invasive haemodynamic measurement devices [34], the mean differences obtained by the proposed algorithms are both “excellent”, whereas the “poor” standard deviations are due to the lack of data for subjects with high PWV in the Twins UK cohort. We now discuss possible causes that led to the PWV estimate errors in our study. Firstly, the reference PWV measurements may have been inaccurate. Previous studies [35, 36] have pointed out that the accuracy of the PWV measurement can be largely affected by inaccuracies in the distance between the carotid and femoral arteries, which is measured on the patient’s body surface by tape when using the SphygmoCor CvMS device. A further study showed that the accuracy of PWV measured by the SphygmoCor device decreased at higher PWV values. A possible explanation could be the larger variability of measured pulse wave transit time compared with other methods [37]. Higher PWV values are associated with small transit times, making the PWV values more sensitive to the variability in the transit time (which appears in the denominator of the PWV calculation). The RMSE and percentage error for the PWV estimates by the RNN model applied to the database of virtual subjects with noise-free data were considerably smaller (0.10 m/s vs 1.59 m/s and 1.2% vs 16.9%). This suggests the existence of measurement errors in the reference PWV values from the Twins UK cohort. However, further investigations on the accuracy of the PWV measurement would be needed to test this hypothesis. Secondly, the errors of the PWV estimates increased with increasing PWV values, which could be due to the low number of high PWV samples in the dataset. It is known that the accuracy of machine learning algorithms decreases as the sample size decreases [38]. Two experiments were carried out to confirm this.
First, we increased the number of training samples with high PWV values in Case Study 1 by resampling the original training dataset with replacement (S4a Fig). As shown in S4b–S4f Fig, this experiment reduced the bias in the estimation for high PWV values to some extent. However, the estimation accuracy (upper and lower LOA) did not improve, since no new information was added to the training process. In the second experiment, we reshuffled the whole dataset from Case Study 1 and split the training and testing datasets with an increased number of subjects with high PWV in the training dataset. This modification improved the estimation accuracy, which brought the standard deviation produced by the RNN model to the “acceptable” level according to the ARTERY Society guidelines [34] (S5 Fig). Therefore, both the bias and the accuracy of the estimation could be improved by training the algorithms with a training database containing more subjects with high PWV values. Lastly, the errors in the PWV estimation could also be the result of confounding biological characteristics of the patients, as the radial pressure wave was the only input used in our estimation pipelines. The Pearson’s correlation coefficients, r, between those biological characteristics and the difference of the estimated and measured PWV indicated that chronological age was the characteristic most strongly associated with the estimation error. However, this was expected since PWV has a positive correlation with chronological age and, as pointed out previously, the PWV estimation accuracy worsened for subjects with higher PWV values due to low sample numbers in the training datasets. Nevertheless, Pearson’s correlation coefficients in both machine learning approaches were smaller than 0.3, indicating a negligible linear correlation [39]. Thus, the analysis suggested that the errors in the estimations would not be largely dependent on the biological characteristics.
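The first experiment, resampling high PWV training samples with replacement, could be sketched as follows. The PWV threshold, the number of extra samples, the function name `oversample_high_pwv`, and the synthetic data are assumptions for illustration, since the exact resampling scheme is not specified in the text.

```python
import numpy as np

def oversample_high_pwv(X, y, threshold, n_extra, rng):
    """Append n_extra samples drawn with replacement from rows with y > threshold."""
    high_idx = np.flatnonzero(y > threshold)
    if high_idx.size == 0:
        return X, y  # nothing to resample
    drawn = rng.choice(high_idx, size=n_extra, replace=True)
    return np.concatenate([X, X[drawn]]), np.concatenate([y, y[drawn]])

rng = np.random.default_rng(4)

# Synthetic training set with few high PWV samples (NOT the Twins UK data).
X_train = rng.normal(size=(500, 17))
y_train = rng.normal(loc=9.4, scale=2.2, size=500)

threshold = 12.0  # hypothetical definition of "high PWV", in m/s
X_res, y_res = oversample_high_pwv(X_train, y_train, threshold, n_extra=200, rng=rng)

print("high PWV samples before:", int((y_train > threshold).sum()))
print("high PWV samples after: ", int((y_res > threshold).sum()))
```

Since the appended rows are copies of existing ones, this changes the weighting of high PWV subjects during training without adding new information, which is consistent with the observation that bias decreased while the limits of agreement did not narrow.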

This study is also subject to a few limitations and requires further work. Firstly, the majority of participants in the Twins UK cohort are female, which means the trained model in this study is less likely to fit well when using unseen data from a wider population. However, this should not affect the accuracy of the estimation within the analysis performed in this study or its conclusions. Secondly, the peripheral pulse wave used in this study was the radial pressure wave measured by applanation tonometry. Further studies using other peripheral pulse waves, such as the PPG signal measured at the digital artery using a fingertip probe or smartphone camera, or the PPG signal measured around the wrist using the Apple Watch or Fitbit, would be needed to further test the pipelines proposed in this study. Lastly, the pulse wave data in this study only contained a single cardiac cycle. Further investigations will be needed to assess the effectiveness of the RNN model in estimating cardiovascular indices using a pulse wave containing multiple cardiac cycles. The SphygmoCor and wearable devices such as the Apple Watch can acquire pulse wave signals over multiple cardiac cycles.

The clinical significance of this study lies in assessing the risk factors for CVD from more accessible measurements. Firstly, the only input information to the proposed algorithms is the radial pressure wave, which is a peripheral pulse wave that can be easily measured via non-invasive devices. Importantly, this also makes the PWV estimation in this study totally independent of chronological age, which has been taken as input in other studies [18]. As chronological age does not necessarily correspond to biological age [40], adding age as a predictor to the algorithm could also bias the estimation results. Estimating PWV without including chronological age also makes the prediction from the proposed algorithms in this study more robust and adequate for assessing vascular ageing. Secondly, the machine learning pipelines proposed in this study can also take other peripheral pulse waves, such as PPG signals, or even single-lead ECG signals with more than one cardiac cycle, as input to estimate CVD risks. Thirdly, the machine learning pipelines proposed in this study can be easily extended to take multiple peripheral pulse waves as input to further improve the accuracy of estimation for CVD risks.

Conclusion

Three case studies have been carried out to investigate the possibility of estimating PWV (a well-established biomarker) from the radial pressure wave (a peripheral pulse wave) using machine learning methods. Results have shown that PWV can be estimated either from the features extracted from the pulse wave (mean difference = 0.2 m/s, upper LOA = 3.75 m/s, lower LOA = -3.34 m/s) or the entire waveform (mean difference = 0.05 m/s, upper LOA = 3.21 m/s, lower LOA = -3.11 m/s) using a clinical database (Twins UK cohort). They also suggested, using a database of virtual subjects, that the estimation of the PWV from the entire radial pressure wave using an RNN model can still be achieved when up to 20% noise is added to the wave signal. However, the proposed methods need to be tested for reproducibility using independent external samples. Still, the outcome of this study can potentially help deliver vascular ageing assessment to a wider population and enable repeated measurements that could improve the accuracy of the assessment. Further application of the machine learning pipelines proposed in this study would also help with remote patient monitoring and connected health. The scripts for the machine learning pipelines proposed in this study are available in the following online repository: https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal.

Supporting information

S1 Fig. Estimation of pulse wave velocity (PWV) using Gaussian process regression with different kernel functions and their sum combinations.

RBF: radial basis function; Matérn: Matérn kernel; RQ: rational quadratic kernel.

(TIF)

S2 Fig. Estimation of pulse wave velocity (PWV) with a 95% confidence interval using Gaussian process regression on a hold-out test set containing 924 subjects.

Panel (a) shows the measured and estimated PWV plotted on top of each other; panel (b) shows the first ten samples in panel (a).

(TIF)

S3 Fig. Comparison of measured and estimated pulse wave velocity (PWV) and Bland-Altman plots using support vector regression, random forest regression and gradient boosting regression on a hold-out test set containing 924 subjects.

(TIF)

S4 Fig. Original training and testing data and resampled training data distribution using the Twins UK cohort data (a) and Bland-Altman plots for a hold-out test set containing 924 subjects with algorithms trained using resampled training data (b-f).

(TIF)

S5 Fig. Resampled training and testing data distribution using the Twins UK cohort data (a) and Bland-Altman plots for a resampled hold-out test set containing 924 subjects with algorithms trained using resampled training data (b-f).

(TIF)

Acknowledgments

The authors would like to thank Dr James R. Bland for discussions, especially during the methodology development.

Data Availability

Most of the measurement data from the clinical study population, the Twins UK cohort, are available to external researchers via an application (https://twinsuk.ac.uk). As the authors of this paper did not participate in the Twins UK study, we do not have the rights to share its data. The database of virtual subjects can be found in the following depository: https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database. The scripts for the machine learning pipelines proposed in this study are also available on the following online depository: https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal.

Funding Statement

This work was supported by the British Heart Foundation (BHF) [PG/15/104/31913], the EPSRC [EP/K031546/1], the Wellcome/Engineering Physical Sciences Research Council (EPSRC) Centre for Medical Engineering at King’s College London [WT 203148/Z/16/Z], the Department of Health through the National Institute for Health Research (NIHR) Cardiovascular MedTech Co-operative at Guy’s and St Thomas’ NHS Foundation Trust (GSTT), the comprehensive Biomedical Research Centre and Clinical Research Facilities awards to Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London and King’s College Hospital NHS Foundation Trust, and the Ministry of Science and Higher Education of the Russian Federation within the framework of state support for the creation and development of World-Class Research Centers “Digital biodesign and personalized healthcare” [075-15-2020-926]. The views expressed are those of the authors and not necessarily those of the BHF, Wellcome Trust, EPSRC, NIHR, GSTT or Ministry of Science. WJ was funded by a King’s College London PGR International Scholarship.

References

  • 1. Laina A, Stellos K, Stamatelopoulos K. Vascular ageing: Underlying mechanisms and clinical implications. Experimental Gerontology. 2018;109:16–30. doi: 10.1016/j.exger.2017.06.007
  • 2. North BJ, Sinclair DA. The intersection between aging and cardiovascular disease. Circulation Research. 2012;110(8):1097–1108. doi: 10.1161/CIRCRESAHA.111.246876
  • 3. Nilsson PM, Boutouyrie P, Laurent S. Vascular aging: A tale of EVA and ADAM in cardiovascular risk assessment and prevention. Hypertension. 2009;54(1):3–10. doi: 10.1161/HYPERTENSIONAHA.109.129114
  • 4. Gomez-Sanchez M, Gomez-Sanchez L, Patino-Alonso MC, Cunha PG, Recio-Rodriguez JI, Alonso-Dominguez R, et al. Vascular aging and its relationship with lifestyles and other risk factors in the general Spanish population: Early Vascular Ageing Study. Journal of Hypertension. 2020;38(6):1110–1122. doi: 10.1097/HJH.0000000000002373
  • 5. Niebauer J, Müller EE, Schönfelder M, Schwarzl C, Mayr B, Stöggl J, et al. Acute effects of winter sports and indoor cycling on arterial stiffness. Journal of Sports Science and Medicine. 2020;19(3):460–468.
  • 6. Nilsson PM, Lurbe E, Laurent S. The early life origins of vascular ageing and cardiovascular risk: The EVA syndrome. Journal of Hypertension. 2008;26(6):1049–1057. doi: 10.1097/HJH.0b013e3282f82c3e
  • 7. Laurent S, Boutouyrie P, Cunha PG, Lacolley P, Nilsson PM. Concept of extremes in vascular aging: From early vascular aging to supernormal vascular aging. Hypertension. 2019;74(2):218–228. doi: 10.1161/HYPERTENSIONAHA.119.12655
  • 8. Vlachopoulos C, Aznaouridis K, Stefanadis C. Prediction of cardiovascular events and all-cause mortality with arterial stiffness. A systematic review and meta-analysis. Journal of the American College of Cardiology. 2010;55(13):1318–1327. doi: 10.1016/j.jacc.2009.10.061
  • 9. Van Bortel LM, Laurent S, Boutouyrie P, Chowienczyk P, Cruickshank JK, De Backer T, et al. Expert consensus document on the measurement of aortic stiffness in daily practice using carotid-femoral pulse wave velocity. Journal of Hypertension. 2012;30(3):445–448. doi: 10.1097/HJH.0b013e32834fa8b0
  • 10. Laurent S, Cockcroft J, Van Bortel L, Boutouyrie P, Giannattasio C, Hayoz D, et al. Expert consensus document on arterial stiffness: Methodological issues and clinical applications. European Heart Journal. 2006;27(21):2588–2605. doi: 10.1093/eurheartj/ehl254
  • 11. Alqudah AM, Albadarneh A, Abu-Qasmieh I, Alquran H. Developing of robust and high accurate ECG beat classification by combining Gaussian mixtures and wavelets features. Australasian Physical and Engineering Sciences in Medicine. 2019;42(1):149–157. doi: 10.1007/s13246-019-00722-z
  • 12. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, et al. Screening for cardiac contractile dysfunction using an artificial intelligence–enabled electrocardiogram. Nature Medicine. 2019;25(1):70–74. doi: 10.1038/s41591-018-0240-2
  • 13. Biswas D, Everson L, Liu M, Panwar M, Verhoef BE, Patki S, et al. CorNET: Deep learning framework for PPG-based heart rate estimation and biometric identification in ambulant environment. IEEE Transactions on Biomedical Circuits and Systems. 2019;13(2):282–291. doi: 10.1109/TBCAS.2019.2892297
  • 14. Awan SE, Bennamoun M, Sohel F, Sanfilippo FM, Dwivedi G. Machine learning-based prediction of heart failure readmission or death: implications of choosing the right model and the right metrics. ESC Heart Failure. 2019;6(2):428–435. doi: 10.1002/ehf2.12419
  • 15. Cikes M, Sanchez-Martinez S, Claggett B, Duchateau N, Piella G, Butakoff C, et al. Machine learning-based phenogrouping in heart failure to identify responders to cardiac resynchronization therapy. European Journal of Heart Failure. 2019;21(1):74–85. doi: 10.1002/ejhf.1333
  • 16. Karunathilake SP, Ganegoda GU. Secondary prevention of cardiovascular diseases and application of technology for early diagnosis. BioMed Research International. 2018;2018. doi: 10.1155/2018/5767864
  • 17. Chakshu NK, Sazonov I, Nithiarasu P. Towards enabling a cardiovascular digital twin for human systemic circulation using inverse analysis. Biomechanics and Modeling in Mechanobiology. 2020; doi: 10.1007/s10237-020-01393-6
  • 18. Tavallali P, Razavi M, Pahlevan NM. Artificial intelligence estimation of carotid-femoral pulse wave velocity using carotid waveform. Scientific Reports. 2018;8(1):1–12. doi: 10.1038/s41598-018-19457-0
  • 19. Moayyeri A, Hammond CJ, Hart DJ, Spector TD. The UK adult twin registry (TwinsUK resource). Twin Research and Human Genetics. 2013;16(1):144–149. doi: 10.1017/thg.2012.89
  • 20. Moayyeri A, Hammond CJ, Valdes AM, Spector TD. Cohort profile: TwinsUK and healthy ageing twin study. International Journal of Epidemiology. 2013;42(1):76–85. doi: 10.1093/ije/dyr207
  • 21. Charlton PH, Celka P, Farukh B, Chowienczyk P, Alastruey J. Assessing mental stress from the photoplethysmogram: A numerical study. Physiological Measurement. 2018;39(5). doi: 10.1088/1361-6579/aabe6a
  • 22. Charlton PH, Mariscal Harana J, Vennin S, Li Y, Chowienczyk P, Alastruey J. Modeling arterial pulse waves in healthy aging: a database for in silico evaluation of hemodynamics and pulse wave indexes. American Journal of Physiology-Heart and Circulatory Physiology. 2019;317(5):H1062–H1085. doi: 10.1152/ajpheart.00218.2019
  • 23. Perez H, Tah JHM. Improving the accuracy of convolutional neural networks by identifying and removing outlier images in datasets using t-SNE. Mathematics. 2020;8(5). doi: 10.3390/math8050662
  • 24. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;9:1735–1780. doi: 10.1162/neco.1997.9.8.1735
  • 25. Lipton ZC, Berkowitz J, Elkan C. A critical review of recurrent neural networks for sequence learning. arXiv. 2015; p. 1–38.
  • 26. Schmidhuber J. Deep learning in neural networks: An overview. Neural Networks. 2015;61:85–117. doi: 10.1016/j.neunet.2014.09.003
  • 27. Gaddum NR, Alastruey J, Beerbaum P, Chowienczyk P, Schaeffter T. A technical assessment of pulse wave velocity algorithms applied to non-invasive arterial waveforms. Annals of Biomedical Engineering. 2013;41(12):2617–2629. doi: 10.1007/s10439-013-0854-y
  • 28. Mikael LdR, de Paiva AMG, Gomes MM, Sousa ALL, Jardim PCBV, Vitorino PVdO, et al. Vascular ageing and arterial stiffness. Arquivos Brasileiros de Cardiologia. 2017;109(3):253–258. doi: 10.5935/abc.20170091
  • 29. Mitchell GF, Parise H, Benjamin EJ, Larson MG, Keyes MJ, Vita JA, et al. Changes in arterial stiffness and wave reflection with advancing age in healthy men and women: The Framingham Heart Study. Hypertension. 2004;43(6):1239–1245. doi: 10.1161/01.HYP.0000128420.01881.aa
  • 30. Wang KL, Cheng HM, Sung SH, Chuang SY, Li CH, Spurgon HA, et al. Wave reflection and arterial stiffness in the prediction of 15-year all-cause and cardiovascular mortalities: A community-based study. Hypertension. 2010;55(3):799–805. doi: 10.1161/HYPERTENSIONAHA.109.139964
  • 31. Salvi P, Lio G, Labat C, Ricci E, Pannier B, Benetos A. Validation of a new non-invasive portable tonometer for determining arterial pressure wave and pulse wave velocity: The PulsePen device. Journal of Hypertension. 2004;22(12):2285–2293. doi: 10.1097/00004872-200412000-00010
  • 32. Hametner B, Wassertheurer S, Kropf J, Mayer C, Eber B, Weber T. Oscillometric estimation of aortic pulse wave velocity: Comparison with intra-aortic catheter measurements. Blood Pressure Monitoring. 2013;18(3):173–176. doi: 10.1097/MBP.0b013e3283614168
  • 33. Jekell A, Kahan T. The usefulness of a single arm cuff oscillometric method (Arteriograph) to assess changes in central aortic blood pressure and arterial stiffness by antihypertensive treatment: results from the Doxazosin-Ramipril Study. Blood Pressure. 2018;27(2):88–98. doi: 10.1080/08037051.2017.1394791
  • 34. Wilkinson IB, McEniery CM, Schillaci G, Boutouyrie P, Segers P, Donald A, et al. ARTERY Society guidelines for validation of non-invasive haemodynamic measurement devices: Part 1, arterial pulse wave velocity. Artery Research. 2010;4(2):34–40. doi: 10.1016/j.artres.2010.03.001
  • 35. Segers P, Kips J, Trachet B, Swillens A, Vermeersch S, Mahieu D, et al. Limitations and pitfalls of non-invasive measurement of arterial pressure wave reflections and pulse wave velocity. Artery Research. 2009;3(2):79–88. doi: 10.1016/j.artres.2009.02.006
  • 36. Weir-McCall JR, Khan F, Cassidy DB, Thakur A, Summersgill J, Matthew SZ, et al. Effects of inaccuracies in arterial path length measurement on differences in MRI and tonometry measured pulse wave velocity. BMC Cardiovascular Disorders. 2017;17(1):1–9. doi: 10.1186/s12872-017-0546-x
  • 37. Grillo A, Parati G, Rovina M, Moretti F, Salvi L, Gao L, et al. Short-Term Repeatability of Noninvasive Aortic Pulse Wave Velocity Assessment: Comparison between Methods and Devices. American Journal of Hypertension. 2018;31(1):80–88. doi: 10.1093/ajh/hpx140
  • 38. Cui Z, Gong G. The effect of machine learning regression algorithms and sample size on individualized behavioral prediction with functional connectivity features. NeuroImage. 2018;178(May):622–637. doi: 10.1016/j.neuroimage.2018.06.001
  • 39. Mukaka MM. Statistics Corner: A guide to appropriate use of correlation coefficient in medical research. Malawi Medical Journal. 2012;24(September):69–71. doi: 10.1016/j.cmpb.2016.01.020
  • 40. Shiels PG, McGuinness D, Eriksson M, Kooman JP, Stenvinkel P. The role of epigenetics in renal ageing. Nature Reviews Nephrology. 2017;13(8):471–482. doi: 10.1038/nrneph.2017.78

Decision Letter 0

Alberto Milan

25 Jan 2021

PONE-D-20-40380

Estimating pulse wave velocity from the radial pressure wave using machine learning algorithms

PLOS ONE

Dear Dr. Jin,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Mar 01 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Alberto Milan

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this paper, the authors proposed two machine learning methods to estimate PWV from radial pressure wave obtained with arterial tonometry, by using Gaussian process regression from features extracted from the waveform and using recurrent neural network from the entire waveform. The use of random noises on the data did not change the accuracy of results obtained by RNN analysis.

The study has the merit to clearly present methods to provide an accessible PWV estimation by peripheral waveform, with some limitations.

These are my remarks:

- The proposed methods need to be tested for reproducibility and independent sample external validation. This should be clearly stated in the conclusions. The pulse wave data analysis was performed on a single cardiac cycle, while the reference method (SphygmoCor CvMS) for measurement of cfPWV requires several cardiac cycles. Could the authors perform a repeatability analysis in a subgroup of subjects considering more than one cardiac cycle?

- Figure 2 and discussion: the heteroscedasticity in the distribution is attributed by the authors to general measurement errors. I partly agree with the authors. A reduced reproducibility and thus a possible lower accuracy of cfPWV measurement was demonstrated for tonometers such as the SphygmoCor at higher PWV values (see Grillo et al. Short-term repeatability of noninvasive aortic pulse wave velocity assessment: comparison between methods and devices. American journal of hypertension, 31(1), 80-88). This is an intrinsic characteristic of the measurement, due to the fact that the time measurement is placed in the denominator of the PWV calculation. Were the cfPWV measurements in the Twins UK cohort performed twice, as currently recommended?

- Figure 2 and discussion: the distribution in the Bland-Altman plots looks skewed for higher values. May this cause an underestimation of PWV by the algorithms for higher PWV values?

Reviewer #2: This is an interesting and well written study. The issue is of high interest for scientists and clinicians. Results could inform future approaches to develop highly efficient tools aimed at facilitating the assessment of CV risk at the population level.

Major concerns

1- The authors stated that: “Both plots suggested that the accuracy of the PWV estimation deteriorated as the value of PWV increased”. On visual inspection of the Bland-Altman plot, more than an increase in dispersion at increasing PWV values (heteroscedasticity), a systematic overestimation at increasing PWV values is found, suggesting systematic bias. This could be tested by appropriate statistics (e.g. correlation analysis). Please check and modify the results accordingly.

2- The authors wrote that “Gaussian process regression can also provide a 95% confidence interval additional to the estimated PWV, which 98% of the measured PWV values were within the 95% confidence interval range”. A similar sentence is replicated also in the discussion (“Gaussian process regression was able to provide a 95% confidence interval for each estimation that covers at least 98% of the measured PWV”). I think that these sentences should be placed in the right context because they may generate a distorted perception of very high levels of accuracy of the estimated PWV approach.

I have some concerns in considering the fact that measured PWV falls within 95% of CI range is a measure of accuracy, because accuracy is usually described in terms of absolute SD values or rather as % of explained variance. I think that LOA of 3.21 m/s and -3.11 m/s, and 49% of variance explained suggest limited accuracy. The authors could also refer to Wilkinson IB et al, Artery Research 2010;4:34-40 and rephrase the sentence (especially in the discussion) accordingly.

3- The authors found that the correlation coefficient between age and the difference of the estimated and measured PWV was high, and they suggested that adding age as a predictor could potentially improve the estimation. I have a different explanation related to my point 1. If the difference between ePWV and mPWV increases at increasing PWV, and PWV increases with age, it is quite expected that this variable (difference) has a residual co-linearity with age. Do the authors agree? Rather, it is important to emphasize the fact that the process of PWV estimation is totally independent from chronological age (differently from other approaches).
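
The statistical check suggested in point 1 can be sketched as a correlation between the Bland-Altman differences and the pairwise means. The example below is a minimal illustration on synthetic data, not the analysis performed in the manuscript. The function name, the synthetic estimator and its parameters are all assumptions made for the illustration.

```python
import numpy as np

def proportional_bias(measured, estimated):
    """Correlate Bland-Altman differences with pairwise means.

    Returns (Pearson r, slope of difference vs mean). A clearly non-zero
    r or slope indicates that the bias changes with the PWV level.
    """
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    diff = estimated - measured
    mean = (estimated + measured) / 2.0
    r = np.corrcoef(mean, diff)[0, 1]
    slope, _intercept = np.polyfit(mean, diff, 1)
    return r, slope

# Synthetic example: an estimator that shrinks values toward 9 m/s,
# so high PWV values are underestimated and low ones overestimated.
rng = np.random.default_rng(0)
measured_pwv = rng.uniform(5.0, 15.0, size=500)
estimated_pwv = 9.0 + 0.5 * (measured_pwv - 9.0) + rng.normal(0.0, 0.5, size=500)
r, slope = proportional_bias(measured_pwv, estimated_pwv)
print(f"r = {r:.2f}, slope = {slope:.2f} (m/s per m/s)")
```

A negative slope here corresponds to increasing underestimation at higher PWV. A slope near zero with widening scatter would instead point to heteroscedasticity without systematic bias.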

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Andrea Grillo

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jun 28;16(6):e0245026. doi: 10.1371/journal.pone.0245026.r002

Author response to Decision Letter 0


5 Mar 2021

We thank the reviewers for their encouraging and constructive comments. In this document we provide a point-by-point response to the comments.

Reviewer #1

In this paper, the authors proposed two machine learning methods to estimate PWV from radial pressure wave obtained with arterial tonometry, by using Gaussian process regression from features extracted from the waveform and using recurrent neural network from the entire waveform. The use of random noises on the data did not change the accuracy of results obtained by RNN analysis.

The study has the merit to clearly present methods to provide an accessible PWV estimation by peripheral waveform, with some limitations.

These are my remarks:

- The proposed methods need to be tested for reproducibility and independent sample external validation. This should be clearly stated in the conclusions. The pulse wave data analysis was performed on a single cardiac cycle, while the reference method (SphygmoCor CvMS) for measurement of cfPWV requires several cardiac cycles. Could the authors perform a repeatability analysis in a subgroup of subjects considering more than one cardiac cycle?

Thank you for the comment. We have added the following sentence to the Conclusion section to clarify that the proposed methods would need to be tested for reproducibility using independent data samples (page 10).

“However, the proposed methods need to be tested for reproducibility using independent external samples.”

This has also been clarified in the Discussion section with the following sentence (page 10):

“Lastly, the pulse wave data in this study only contained a single cardiac cycle. Further investigations will be needed to assess the effectiveness of the RNN model in estimating cardiovascular indices using a pulse wave containing multiple cardiac cycles. The SphygmoCor and a wearable device such as the Apple Watch can acquire pulse wave signals over multiple cardiac cycles.”

Unfortunately, we do not have the raw data from the SphygmoCor CvMS containing multiple cardiac cycles to use for testing the RNN model.

- Figure 2 and discussion: the heteroscedasticity in the distribution is attributed by the authors to general measurement errors. I partly agree with the authors. A reduced reproducibility and thus a possible lower accuracy of cfPWV measurement was demonstrated for tonometers such as the SphygmoCor at higher PWV values (see Grillo et al. Short-term repeatability of noninvasive aortic pulse wave velocity assessment: comparison between methods and devices. American journal of hypertension, 31(1), 80-88). This is an intrinsic characteristic of the measurement, due to the fact that the time measurement is placed in the denominator of the PWV calculation. Were the cfPWV measurements in the Twins UK cohort performed twice, as currently recommended?

Thank you for providing these references and explaining the possible error propagation in the PWV measurement by the SphygmoCor device. The following sentence was added to the Discussion section which contains the suggested reference (page 9):

“A further study showed that the accuracy of PWV measured by the SphygmoCor device decreased at higher PWV values. A possible explanation could be the larger variability of measured pulse wave transit time compared with other methods [1]. Higher PWV values are associated with a small transit time, making the PWV values more sensitive to the variability in the transit time (which appears in the denominator of the PWV calculation).”

And yes, the cfPWV measurements in the Twins UK cohort were performed at least twice, as currently recommended.
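
The transit-time argument in the quoted sentence can be illustrated numerically. With PWV = d/Δt, a small error δt in the transit time changes PWV by approximately −(PWV²/d)·δt, so the same timing error produces a larger PWV error at higher PWV. The sketch below assumes a hypothetical path length and a hypothetical fixed timing error; neither value comes from the study.

```python
def pwv_error(path_length_m, true_pwv, timing_error_s):
    """Error in PWV = distance / transit time caused by a fixed timing error."""
    true_transit = path_length_m / true_pwv
    measured_pwv = path_length_m / (true_transit + timing_error_s)
    return measured_pwv - true_pwv

PATH_LENGTH_M = 0.5      # hypothetical carotid-femoral path length
TIMING_ERROR_S = 0.002   # hypothetical 2 ms overestimate of the transit time

errors = {
    pwv: pwv_error(PATH_LENGTH_M, pwv, TIMING_ERROR_S)
    for pwv in (6.0, 10.0, 14.0)
}
for pwv, err in errors.items():
    print(f"true PWV {pwv:4.1f} m/s -> error {err:+.2f} m/s")
```

Under these assumptions the magnitude of the error grows roughly with the square of PWV, which is consistent with the lower repeatability reported at high PWV values.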

- Figure 2 and discussion: the distribution in the Bland-Altman plots looks skewed for higher values. May this cause an underestimation of PWV by the algorithms for higher PWV values?

Thank you for pointing this out. The answer is yes; the accuracy of the PWV estimates deteriorates for higher PWV values, mainly because the number of subjects with high PWV is far smaller than the number with lower PWV. As machine learning algorithms are data driven, the scarcity of subjects with high PWV makes estimation for higher PWV values more challenging. To better illustrate this point, two additional experiments have been carried out, which involve i) increasing the training dataset from Case Study 1 (Twins UK cohort) by resampling the original training dataset with replacement, and ii) reshuffling the whole dataset from Case Study 1 and splitting the training and testing datasets with an increased number of high PWV subjects in the training dataset. The first experiment shows that increasing the number of training samples (weights) for higher PWV values can reduce the bias in the estimation for the testing dataset. The second experiment shows that increasing the number of high PWV values in the training dataset, while decreasing the number of high PWV values in the testing dataset, can improve the accuracy of the estimation. These results have been added to the Supplemental Information along with the following paragraph, which has been added to the Discussion (page 9).

“Two experiments were carried out to confirm this. First, we increased the training dataset in Case Study 1 with high PWV values by resampling the original training dataset with replacement (S4 Fig a). As shown in S4 Fig b-f, this experiment reduced the bias in the estimation for high PWV values to some extent. However, the estimation accuracy (upper and lower LOA) did not improve, since no new information was added to the training process. In the second experiment, we reshuffled the whole dataset from Case Study 1 and split the training and testing datasets with an increased number of subjects with high PWV in the training dataset. This modification improved the estimation accuracy, which brought the standard deviation produced by the RNN model to the “acceptable” level according to the ARTERY Society guidelines [2] (S5 Fig). Therefore, both the bias and the accuracy of the estimation could be improved by training the algorithms with a training database containing more subjects with high PWV values.”
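
The first experiment, resampling the training set with replacement to give more weight to high-PWV subjects, can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' script. The function name, the threshold of 12 m/s and the oversampling factor are hypothetical choices made for the example.

```python
import numpy as np

def oversample_high_pwv(features, pwv, threshold, factor, seed=0):
    """Append bootstrap copies of high-PWV subjects to the training set.

    `threshold` and `factor` are hypothetical tuning choices, not values
    taken from the study. No new information is added: the extra rows are
    duplicates drawn with replacement from the existing high-PWV subjects.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features)
    pwv = np.asarray(pwv, dtype=float)
    high_idx = np.flatnonzero(pwv > threshold)
    if high_idx.size == 0:
        return features, pwv
    n_extra = int(high_idx.size * (factor - 1))
    extra = rng.choice(high_idx, size=n_extra, replace=True)
    keep = np.concatenate([np.arange(pwv.size), extra])
    return features[keep], pwv[keep]

# Synthetic training set, skewed toward low PWV values
rng = np.random.default_rng(1)
train_pwv = rng.gamma(shape=9.0, scale=1.0, size=1000)
train_features = rng.normal(size=(1000, 4))
X_res, y_res = oversample_high_pwv(train_features, train_pwv,
                                   threshold=12.0, factor=3)
print((train_pwv > 12.0).sum(), "->", (y_res > 12.0).sum(), "high-PWV rows")
```

Because the added rows are duplicates, this kind of resampling can shift the bias but cannot supply new information, which matches the observation that the limits of agreement did not improve in the first experiment.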

Reviewer #2

This is an interesting and well written study. The issue is of high interest for scientists and clinicians. Results could inform future approaches to develop highly efficient tools aimed at facilitating the assessment of CV risk at the population level.

Major concerns

1- The authors stated that: “Both plots suggested that the accuracy of the PWV estimation deteriorated as the value of PWV increased”. On visual inspection of the Bland-Altman plot, more than an increase in dispersion at increasing PWV values (heteroscedasticity), a systematic overestimation at increasing PWV value is found, suggesting systematic bias. This could be tested by appropriate statistics (e.g. correlation analysis). Please, check and modify the results accordingly.

Thank you for raising this point. The underestimation at higher PWV values arises because far fewer subjects have high PWV than low PWV. As machine learning algorithms are data driven, the scarcity of subjects with high PWV makes estimation at higher PWV values more challenging. To illustrate this point, two additional experiments have been carried out: i) enlarging the training dataset from Case Study 1 (Twins UK cohort) by resampling the original training dataset with replacement, and ii) reshuffling the whole dataset from Case Study 1 and splitting it into training and testing datasets with an increased number of high PWV subjects in the training dataset. The first experiment shows that increasing the number (weight) of training samples with higher PWV values can reduce the bias in the estimation for the testing dataset. The second experiment shows that increasing the number of high PWV subjects in the training dataset, while decreasing their number in the testing dataset, can improve the accuracy of the estimation. These results have been added to the Supplemental Information along with the following paragraph, which has been added to the Discussion (page 9).

“Two experiments were carried out to confirm this. First, we increased the training dataset in Case Study 1 with high PWV values by resampling the original training dataset with replacement (S4 Fig a). As shown in S4 Fig b-f, this experiment reduced the bias in the estimation for high PWV values to some extent. However, the estimation accuracy (upper and lower LOA) did not improve, since no new information was added to the training process. In the second experiment, we reshuffled the whole dataset from Case Study 1 and split the training and testing datasets with an increased number of subjects with high PWV in the training dataset. This modification improved the estimation accuracy, which brought the standard deviation produced by the RNN model to the “acceptable” level according to the ARTERY Society guidelines [2] (S5 Fig). Therefore, both the bias and the accuracy of the estimation could be improved by training the algorithms with a training database containing more subjects with high PWV values.”
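
The correlation analysis suggested in point 1 could be sketched as follows. This is an illustrative sketch only, not the analysis reported in the paper: it assumes the limits of agreement are defined as mean difference ± 1.96 SD, it takes the Pearson correlation between the differences and the pairwise averages as the check for proportional bias, and the `measured` and `estimated` values are made up.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bland_altman(estimated, measured):
    """Bias, limits of agreement, and a simple proportional-bias check."""
    diffs = [e - m for e, m in zip(estimated, measured)]
    avgs = [(e + m) / 2 for e, m in zip(estimated, measured)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "upper_loa": bias + 1.96 * sd,
        "lower_loa": bias - 1.96 * sd,
        # r far from 0 means the difference changes systematically with PWV
        "r_diff_vs_mean": pearson_r(avgs, diffs),
    }


# Made-up PWV values in m/s; the estimates fall short at high PWV.
measured = [6.0, 7.0, 8.0, 9.0, 10.0, 12.0, 14.0, 16.0]
estimated = [6.4, 7.1, 8.0, 8.8, 9.5, 10.9, 12.2, 13.5]
result = bland_altman(estimated, measured)
print(result)
```

A correlation close to zero would point to dispersion alone (heteroscedasticity), whereas a strong correlation, as in this toy example, indicates a bias that grows with PWV.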

2- The authors wrote that “Gaussian process regression can also provide a 95% confidence interval additional to the estimated PWV, which 98% of the measured PWV values were within the 95% confidence interval range”. A similar sentence is replicated also in the discussion (“Gaussian process regression was able to provide a 95% confidence interval for each estimation that covers at least 98% of the measured PWV”). I think that these sentences should be placed in the right context because they may generate a distorted perception of very high levels of accuracy of the estimated PWV approach.

I have some concerns about treating the fact that measured PWV falls within the 95% CI range as a measure of accuracy, because accuracy is usually described in terms of absolute SD values or as % of explained variance. I think that LOA of 3.21 m/s and -3.11 m/s, and 49% of variance explained suggest limited accuracy. The authors could also refer to Wilkinson IB et al, Artery Research 2010;4:34-40 and rephrase the sentence (especially in the discussion) accordingly.

Thank you for pointing this out. We agree that using the 95% confidence interval as a metric for estimation accuracy might not be appropriate here. We have deleted this sentence from the Abstract. However, the 95% confidence interval is a statistically meaningful range that shows the reliability of the estimation. Thus, the sentences regarding the confidence interval in the Discussion (page 8) and elsewhere (page 5) have been replaced with the following sentence.

“Gaussian process regression can provide a statistically meaningful range (95% confidence interval) that shows the reliability of the estimation.”
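
The distinction raised in point 2, between interval coverage and accuracy, can be made concrete with a small sketch. It assumes each estimate comes with a mean and a standard deviation and that the 95% interval is mean ± 1.96 SD; the function names and all numbers are hypothetical and are not taken from the study.

```python
def interval_coverage(measured, est_mean, est_std, z=1.96):
    """Fraction of measured values inside est_mean +/- z * est_std."""
    inside = sum(
        1
        for m, mu, s in zip(measured, est_mean, est_std)
        if mu - z * s <= m <= mu + z * s
    )
    return inside / len(measured)


def rmse(measured, est_mean):
    """Root-mean-square error of the point estimates."""
    n = len(measured)
    return (sum((m - mu) ** 2 for m, mu in zip(measured, est_mean)) / n) ** 0.5


# Made-up PWV values in m/s. The point estimates are identical in both cases;
# only the reported uncertainty differs.
measured = [7.0, 8.0, 9.0, 11.0, 14.0]
est_mean = [8.0, 8.5, 9.5, 10.0, 11.5]
narrow_std = [0.3] * 5  # narrow intervals
wide_std = [2.0] * 5    # wide intervals

coverage_narrow = interval_coverage(measured, est_mean, narrow_std)
coverage_wide = interval_coverage(measured, est_mean, wide_std)
error = rmse(measured, est_mean)
print(coverage_narrow, coverage_wide, error)
```

Here the point-estimate error is the same in both cases, yet coverage rises from 40% to 100% simply because the intervals are wider. High coverage therefore describes how well the interval is calibrated rather than how accurate the estimate is.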

With regard to the discussion on accuracy, the following sentence, which includes the reference by Wilkinson IB et al. suggested by the reviewer, has been added to the Discussion (page 9).

“Based on the ARTERY Society guidelines for validation of non-invasive haemodynamic measurement devices [2], the mean differences obtained by the proposed algorithms are both “excellent”, whereas the “poor” standard deviations are due to the lack of data for subjects with high PWV in the Twins UK cohort. We now discuss possible causes that led to the PWV estimate errors in our study.”
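
For readers who want to relate the reported limits of agreement to the standard deviations discussed here, the following sketch converts one into the other. It assumes the LOA were computed as mean difference ± 1.96 SD, which this response does not state explicitly; the helper name `loa_to_bias_sd` is hypothetical. The input values are the LOA reported in the Abstract for the 924 hold-out test subjects.

```python
def loa_to_bias_sd(upper_loa, lower_loa, z=1.96):
    """Recover mean difference and SD from limits of agreement.

    Assumes upper_loa = bias + z * sd and lower_loa = bias - z * sd.
    """
    bias = (upper_loa + lower_loa) / 2
    sd = (upper_loa - lower_loa) / (2 * z)
    return bias, sd


# LOA in m/s as reported in the Abstract.
gpr_bias, gpr_sd = loa_to_bias_sd(3.75, -3.34)  # Gaussian process regression
rnn_bias, rnn_sd = loa_to_bias_sd(3.21, -3.11)  # recurrent neural network

print(f"GPR: bias={gpr_bias:.2f} m/s, SD={gpr_sd:.2f} m/s")
print(f"RNN: bias={rnn_bias:.2f} m/s, SD={rnn_sd:.2f} m/s")
```

Under that assumption, the recovered mean differences match the reported 0.2 m/s and 0.05 m/s after rounding, and the implied standard deviations are roughly 1.8 m/s and 1.6 m/s. These can then be compared with the thresholds given in the ARTERY Society guidelines [2].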

3- The authors found that the correlation coefficient between age and the difference of the estimated and measured PWV was high, and they suggested that adding age as a predictor could potentially improve the estimation. I have a different explanation related to my point 1. If the difference between ePWV and mPWV increases at increasing PWV, and PWV increases with age, it is quite expected that this variable (difference) has a residual co-linearity with age. Do the authors agree? Rather, it is important to emphasize the fact that the process of PWV estimation is totally independent from chronological age (differently from other approaches).

Thank you for your insights. Yes, we agree that PWV generally increases with age, and that the difference between estimated and measured PWV increases with increasing PWV values and thus with chronological age. The sentence involving the correlation coefficient in the Discussion has been modified as follows (pages 9-10).

“The Pearson’s correlation coefficients, r, between those biological characteristics and the difference of the estimated and measured PWV indicated that the chronological age was associated with the estimation error the most. However, this was expected since PWV has a positive correlation with chronological age and, as pointed out previously, the PWV estimation accuracy worsened for subjects with higher PWV values due to low sample numbers in the training datasets.”
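
The collinearity argument in point 3 can be illustrated with a partial correlation on synthetic data. This sketch is not part of the study: the data are simulated under the stated assumption that the estimation error depends only on PWV and that PWV rises with age, and all coefficients and noise levels are arbitrary.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic cohort (arbitrary numbers): PWV rises with age, and the
# estimation error depends on PWV only, never directly on age.
rng = random.Random(0)
age = [rng.uniform(20, 80) for _ in range(500)]
pwv = [5 + 0.1 * a + rng.gauss(0, 1) for a in age]
error = [-0.4 * (p - 10) + rng.gauss(0, 1) for p in pwv]

r_raw = pearson_r(error, age)
r_partial = partial_r(error, age, pwv)
print(f"r(error, age) = {r_raw:.2f}")
print(f"r(error, age | PWV) = {r_partial:.2f}")
```

In this simulation the raw correlation between error and age is substantial, while the partial correlation controlling for PWV is close to zero. That is the pattern one would expect if the association with age is inherited from PWV rather than being a direct effect of age.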

And yes, the point here is that the PWV estimation in this study is totally independent from chronological age, which differs from other approaches. The following sentence has also been added to the last paragraph in the Discussion to emphasise this (page 10).

“Importantly, this also makes the PWV estimation in this study totally independent of chronological age, which has been taken as input in other studies [3]. As chronological age does not necessarily correspond to the biological age [4], adding age as a predictor to the algorithm could also bias the estimation results. Estimating PWV without including chronological age also makes the prediction from the proposed algorithms in this study more robust and adequate for assessing vascular ageing.”

References

1. Repeatability of Noninvasive Aortic Pulse Wave Velocity Assessment: Comparison between Methods and Devices. Am J Hypertens. 2018;31: 80–88. doi:10.1093/ajh/hpx140

2. Wilkinson IB, McEniery CM, Schillaci G, Boutouyrie P, Segers P, Donald A, et al. ARTERY Society guidelines for validation of non-invasive haemodynamic measurement devices: Part 1, arterial pulse wave velocity. Artery Res. 2010;4: 34–40. doi:10.1016/j.artres.2010.03.001

3. Tavallali P, Razavi M, Pahlevan NM. Artificial intelligence estimation of carotid-femoral pulse wave velocity using carotid waveform. Sci Rep. 2018;8: 1–12. doi:10.1038/s41598-018-19457-0

4. Shiels PG, McGuinness D, Eriksson M, Kooman JP, Stenvinkel P. The role of epigenetics in renal ageing. Nat Rev Nephrol. 2017;13: 471–482. doi:10.1038/nrneph.2017.78

Attachment

Submitted filename: Response_to_Reviewers_Comments_V2.pdf

Decision Letter 1

Alberto Milan

2 Jun 2021

Estimating pulse wave velocity from the radial pressure wave using machine learning algorithms

PONE-D-20-40380R1

Dear Dr. Jin,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Alberto Milan

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The paper is significantly improved from the previous version. My previous remarks have been adequately addressed.

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Alberto Milan

17 Jun 2021

PONE-D-20-40380R1

Estimating pulse wave velocity from the radial pressure wave using machine learning algorithms

Dear Dr. Jin:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Alberto Milan

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Fig. Estimation of pulse wave velocity (PWV) using Gaussian process regression with different kernel functions and their sum combinations.

    RBF: radial basis function; Matérn: Matérn kernel; RQ: rational quadratic kernel.

    (TIF)

    S2 Fig. Estimation of pulse wave velocity (PWV) with a 95% confidence interval using Gaussian process regression on a hold-out test set containing 924 subjects.

    Panel (a) shows the measured and estimated PWV plot on top of each other; panel (b) shows the first ten samples in panel (a).

    (TIF)

    S3 Fig. Comparison of measured and estimated pulse wave velocity (PWV) and Bland-Altman plots using support vector regression, random forest regression and gradient boosting regression on a hold-out test set containing 924 subjects.

    (TIF)

    S4 Fig. Original training and testing data and resampled training data distribution using the Twins UK cohort data (a) and Bland-Altman plots for a hold-out test set containing 924 subjects with algorithms trained using resampled training data (b-f).

    (TIF)

    S5 Fig. Resampled training and testing data distribution using the Twins UK cohort data (a) and Bland-Altman plots for a resampled hold-out test set containing 924 subjects with algorithms trained using resampled training data (b-f).

    (TIF)

    Data Availability Statement

    Most of the measurement data from the clinical study population, the Twins UK cohort, is available to external researchers via an application (https://twinsuk.ac.uk). As the authors of this paper have not participated in the Twins UK study, we do not have the rights to share their data. The database of virtual subjects can be found in the following depository: https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database. The scripts for the machine learning pipelines proposed in this study are also available on the following online depository: https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal.


    Articles from PLoS ONE are provided here courtesy of PLOS
