Abstract
This study presents a non-contact approach to measuring heart rate and blood pressure using an image photoplethysmography (iPPG) signal, and compares the results to those from an oscillometric blood pressure meter. Facial videos of 100 subjects were recorded via a webcam under ambient lighting conditions to extract iPPG signals. The results revealed a strong correlation between the heart rate derived from iPPG and that obtained from an oscillometric blood pressure meter. In addition, continuous wavelet transform images with a 6-s duration were used as input for a custom convolutional neural network model, providing the most accurate blood pressure estimation. The proposed method received a grade A for diastolic and grade B for systolic blood pressure based on the British Hypertension Society's criteria. Its errors also fell within the limits set by the Association for the Advancement of Medical Instrumentation. This non-contact framework shows promising potential for efficient screening purposes.
Keywords: Non-contact, Heart rate, Blood pressure, Wavelet transform, Convolutional neural network
Highlights
• Non-contact approach to measuring heart rate and blood pressure using an image photoplethysmography (iPPG) signal.
• Direct estimation of blood pressure by utilizing the continuous wavelet transform of the iPPG signal and a compact CNN model.
• Heart rate and blood pressure can be accurately estimated using segment lengths of 30 s and 6 s, respectively.
• Evaluation of the proposed method based on the criteria set by the BIHS and the AAMI.
1. Introduction
In the outpatient department of a hospital, body temperature, heart rate, and blood pressure (BP) are taken and recorded as part of the pre-screening process. At the screening point, body temperature is commonly measured using a digital thermometer, while heart rate and BP are taken with a digital blood pressure meter. An oscillometric BP meter is widely used at screening points in hospitals, due to its convenience and automatic measurement. Since the outbreak of coronavirus disease 2019 (COVID-19), there has been a high demand for non-contact medical devices, to reduce the risk of infection. Although such non-contact devices for measuring body temperature are widely used for screening purposes, the measurement of heart rate and BP without the need for physical contact is still an area of ongoing research.
Photoplethysmography (PPG) is a non-invasive technique that has been widely employed in many studies to measure heart rate and BP. Since PPG measures the changes in the blood volume of the body over the cardiac cycle, the PPG signal also characterizes the systolic and diastolic processes of the heart, which are closely related to BP [1]. Measurements of PPG signals can be conducted via either a contact or non-contact process. In the case of contact PPG (cPPG), a photodetector, typically attached to a specific body part [3,4], is used to measure changes in the light transmitted or reflected by the blood vessels in human skin [2]. In contrast, non-contact or remote methods use a camera to measure the light intensity over a larger area, thus enabling the acquisition of iPPG [5,6].
Many researchers have utilized iPPG signals to develop non-contact methods to accurately measure heart rate [7–9] and BP [10–12]. There are two primary non-contact methods that are commonly employed. The first is known as pulse transit time (PTT), which refers to the time difference between two pulse waves within a cardiac cycle. This can be achieved by measuring the time difference between iPPG signals from different parts of the body [11,13,14]. The second method is based on the morphological theory of iPPG [1]. Although PTT is a parameter that is considered to be correlated with BP, obtaining an additional iPPG signal can incur extra time and cost. Hence, this study focuses on the second method, which relies only on a single iPPG signal.
Advancements in computational power and artificial intelligence technology have made machine learning a popular method for developing predictive models [10,15]. However, the ability to train an artificial neural network to accurately estimate BP from a video is constrained by a lack of publicly available iPPG databases. In contrast, cPPG databases are relatively abundant, and several studies have reported methods of improving the accuracy of heart rate measurements obtained from cPPG [2,4,16,17]. Cuffless methods of predicting BP from cPPG have been widely studied, including time feature selection [1,18–21], a featureless method [20], and a wavelet transform approach [21]. These contact methods have made great progress, and there are now commercially available devices such as pulse oximeters and smart watches for non-invasive measurements of heart rate and BP [22,23]. However, algorithms developed for cPPG signals cannot be applied directly to iPPG signals. To overcome the limited availability of iPPG data, Bousefsaf et al. proposed a method of converting iPPG signals to cPPG signals, thereby enabling the use of large existing databases for the estimation of BP from converted cPPG signals [24]. However, the results of their study may not be generalizable to a larger population or different experimental conditions. The same research group [25] has also reported a model for predicting the continuous wavelet transform (CWT) representation of a BP signal from the CWT of an iPPG signal, in which the BP time series was recovered based on an inverse CWT. The results showed that a wavelet transform could represent the continuous time-frequency content of the iPPG signal, although only videos with clear iPPG signals were included in the dataset. In addition, the high frame rates of the videos used in these studies may not be representative of real-world scenarios.
Several attempts have been made to train machine learning models for the estimation of BP using iPPG signals obtained from self-created databases. Rong et al. collected facial videos of subjects via a webcam under ambient light conditions, and selected features to represent iPPG signals for training of a machine learning model [10]. Similarly, Luo et al. collected facial videos using a smartphone camera with LED illumination, and used the features extracted from facial blood flow signals to develop a BP prediction model [15]. These recent attempts have demonstrated that BP can be estimated from facial videos with promising performance, particularly when clear iPPG signals are present. It should be noted that iPPG signals are susceptible to various sources of noise, such as motion artifacts, which can distort the shape of the iPPG waveform [11,24], and the investigation of ways of representing iPPG signal morphology therefore remains an ongoing issue. Another documented approach uses iPPG signals directly as input for a blood pressure model, eliminating the need for feature extraction. Li et al.'s study [26] demonstrated the effectiveness of deep learning methods in estimating blood pressure from iPPG waveforms. However, to enhance the model's predictive performance, the inclusion of personal information such as height, weight, gender, and BMI was necessary. In contrast, Cheng et al. [27] relied solely on the iPPG signal for blood pressure prediction using a multi-stage deep learning model. This model integrated a Convolutional Neural Network (CNN) and a bidirectional Gated Recurrent Neural Network (BiGRU) across multiple stages, potentially requiring substantial computational resources.
This study presents a non-contact method of measuring heart rate and BP using an iPPG signal. Facial videos were captured using a webcam, under ambient lighting conditions, and the heart rate and BP of each subject were simultaneously measured by an oscillometric BP meter. We demonstrate that a modified iPPG signal improves the accuracy of the derived heart rate, as assessed against the reference device. Additionally, we introduce for the first time the application of wavelet transforms along with a small, lightweight CNN model for direct BP estimation from the iPPG signal. Moreover, we identify the optimal segment length of the iPPG signal for achieving accurate estimations of both heart rate and BP.
This paper is organized as follows. Section 2 describes our methodology, including the data collection and processing techniques used. The results of our experiments are presented in Section 3, followed by a detailed discussion in Section 4. Finally, Section 5 concludes the paper, with a summary of our findings and suggestions of avenues for future research.
2. Research method
2.1. Data collection
Adult volunteers (≥18 years) were recruited from Ubon Ratchathani University (Ubon Ratchathani, Thailand). The use of human subjects in this study was approved by the Ubon Ratchathani University Research Ethics Board (UBU-REC-136/2564). We collected data from 100 subjects, all of whom gave informed consent beforehand. Subjects were seated at the data collection system for at least 5 min. The system consisted of a computer and a webcam (Logitech C920C), which was set to a frame rate of 20 FPS and a resolution of 640 × 480 pixels. Subjects were asked to sit upright and to face the screen monitor of the measurement system under ambient light conditions, at about 30–40 cm from the screen, and were then instructed to slightly adjust the height and position of the chair to align their face with an elliptical shape displayed on the screen. For each subject, we recorded a 30 s video. We simultaneously collected heart rate and BP data using a digital blood pressure meter (HEM-7156-A, OMRON). The inflatable cuff of the digital BP meter was placed on the subject's upper left arm. While the measurements were taken, movement behind or around the subject was restricted to maintain camera focus and minimize vibrations.
2.2. iPPG signal acquisition
All videos were processed using the procedure shown in Fig. 1 to obtain the iPPG signal. Subjects provided consent for the publication of images. The subject's face was detected, as shown by the yellow rectangle, using a pretrained MobileNet V2 with the dlib library. The forehead region was defined by the blue rectangle; this area was chosen because it provided a clearer and stronger iPPG signal, making it easier to extract and analyze the iPPG waveform [28,29]. The position of the forehead was calculated from the proportions of the yellow frame. The midpoint of the blue frame, (x, y), was placed at the horizontal midpoint of the yellow frame and 0.2 times the frame height below its top edge. The blue frame was then created by setting the top and bottom edges to y ± 0.1y and the left and right edges to x ± 0.18x. This region of interest (ROI) was then cropped, resulting in an area 110–125 pixels in width and 40–60 pixels in height.
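The ROI geometry described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `forehead_roi` and the example face box are hypothetical, the face box is taken to be given in image pixel coordinates, and the horizontal midpoint of the face box is assumed to be the forehead midpoint.

```python
def forehead_roi(face_left, face_top, face_width, face_height):
    """Return (left, top, right, bottom) of the forehead ROI in pixels.

    Assumption: (x, y) is the forehead midpoint in image coordinates,
    located at the horizontal center of the face box and 0.2 times the
    box height below its top edge.
    """
    x = face_left + face_width / 2.0
    y = face_top + 0.2 * face_height
    # Edges follow the stated rule: x +/- 0.18x and y +/- 0.1y.
    left, right = x - 0.18 * x, x + 0.18 * x
    top, bottom = y - 0.1 * y, y + 0.1 * y
    return tuple(int(round(v)) for v in (left, top, right, bottom))


# Hypothetical face box roughly centered in a 640 x 480 frame.
roi = forehead_roi(face_left=220, face_top=140, face_width=200, face_height=300)
roi_width = roi[2] - roi[0]
roi_height = roi[3] - roi[1]
```

For this example box the ROI is 116 pixels wide and 40 pixels high, which is consistent with the 110–125 by 40–60 pixel range reported in the text.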
Fig. 1.
Data processing procedure.
The average luminosity of the green channel in the cropped area was calculated for each frame, to acquire the green channel signal (G). This signal was filtered by a bandpass filter with a passband of 0.7–4.0 Hz. The resulting filtered signals were smoothed using a moving average, and baseline removal was applied. The signals were then normalized to a value between zero and one to acquire iPPG1. For comparison, the plane-orthogonal-to-skin (POS) technique was employed to extract the iPPG signal from the skin pixels [6], and the same procedures were then applied to obtain iPPG2. Fig. 2 shows the iPPG signals for two subjects: it can be observed that the iPPG1 (Fig. 2(a)) and iPPG2 (Fig. 2(b)) signals for subject A are similar, while some of the minor peaks in the iPPG1 signal for subject B are absent from iPPG2. To retain some of the characteristics of iPPG1 after employing the POS technique, we calculated iPPG3 by averaging iPPG1 and iPPG2, as shown in Fig. 2(c). These procedures were performed using an in-house program utilizing Python libraries and functions.
Fig. 2.
Examples of iPPG signals obtained from two subjects:
(a) iPPG1; (b) iPPG2; (c) iPPG3.
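The green-channel processing chain of Section 2.2 can be approximated with NumPy alone. The sketch below is not the in-house program: the ideal FFT band-pass, the 3-sample moving-average window, and mean subtraction as baseline removal are all assumptions, since the text does not specify the filter type, window length, or baseline method. All function names are hypothetical.

```python
import numpy as np

FS = 20.0  # webcam frame rate in Hz, as reported in Section 2.1


def green_mean(roi_pixels):
    """Mean of the green channel of an (H, W, 3) ROI (index 1 in RGB and BGR)."""
    return float(np.asarray(roi_pixels)[:, :, 1].mean())


def bandpass_fft(signal, fs, low=0.7, high=4.0):
    """Ideal band-pass: zero every FFT bin outside [low, high] Hz (assumed filter type)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


def moving_average(signal, window=3):
    """Smooth with a centered moving average (window length is an assumption)."""
    return np.convolve(signal, np.ones(window) / window, mode="same")


def normalize01(signal):
    """Rescale so that the minimum is 0 and the maximum is 1."""
    lo, hi = signal.min(), signal.max()
    return (signal - lo) / (hi - lo)


def green_to_ippg(green_trace, fs=FS):
    """Band-pass, smooth, remove baseline, and normalize a green-channel trace."""
    filtered = bandpass_fft(np.asarray(green_trace, dtype=float), fs)
    smoothed = moving_average(filtered)
    detrended = smoothed - smoothed.mean()  # simple baseline removal (assumption)
    return normalize01(detrended)


def average_signals(ippg_a, ippg_b):
    """Pointwise average of two equally long signals (iPPG3 from iPPG1 and iPPG2)."""
    return 0.5 * (np.asarray(ippg_a) + np.asarray(ippg_b))


# Synthetic 30 s trace: 1.2 Hz pulse on top of a DC level and a slow drift.
t = np.arange(600) / FS
green_trace = 120.0 + 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.05 * t
ippg1 = green_to_ippg(green_trace)
```

The same `green_to_ippg` post-processing could be applied to a POS-derived trace before calling `average_signals`; the POS projection itself is not reproduced here.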
2.3. Determination of heart rate from the iPPG signal
A fast Fourier transform (FFT) was applied to transform the iPPG signal from the time domain to the frequency domain, and the frequency with the maximum magnitude was determined and then multiplied by 60 to give the heart rate in beats per minute (BPM) [2]. These processes were implemented using the OpenCV library in Python. The heart rates obtained from the iPPG1, iPPG2 and iPPG3 signals are referred to in the following as HR1, HR2 and HR3, respectively.
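The dominant-frequency estimate can be illustrated with NumPy's FFT routines. This is a minimal sketch, not the authors' implementation: the function name `heart_rate_bpm` is hypothetical, and restricting the peak search to the 0.7–4.0 Hz passband is an assumption made here to match the band-pass filter of Section 2.2.

```python
import numpy as np


def heart_rate_bpm(ippg, fs=20.0, low=0.7, high=4.0):
    """Heart rate (BPM) from the frequency of the largest FFT magnitude."""
    x = np.asarray(ippg, dtype=float)
    x = x - x.mean()                        # remove the DC component
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (freqs >= low) & (freqs <= high)  # assumed search band
    peak_hz = freqs[band][np.argmax(magnitude[band])]
    return 60.0 * peak_hz


# Synthetic 30 s signal sampled at 20 FPS with a 1.2 Hz (72 BPM) pulse.
t = np.arange(600) / 20.0
hr = heart_rate_bpm(np.sin(2 * np.pi * 1.2 * t))
```

Without zero-padding or peak interpolation, the FFT bins are spaced 1/T Hz apart for a signal of duration T, so a 30 s segment resolves heart rate in steps of 2 BPM while a 3 s segment resolves it only in steps of 20 BPM.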
2.4. Continuous wavelet transform of the iPPG signal
We used a CWT of the iPPG signal to train the CNN. A CWT exploits the modulating characteristics of the mother wavelet to represent signals in the form of an analytical equation, as in Eq. (1):
$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \psi_{a,b}^{*}(t)\, dt \tag{1}$$
where f(t) is the input signal, ψ(t) is the mother wavelet function, and * denotes complex conjugation. The mother wavelet is a functional set that can be used as a basis to represent any signal through a wavelet transform. The function of the mother wavelet is defined by a scaling parameter a and a shifting parameter b, which are used to adjust the scale and position of the wavelet function, respectively, as shown in Eq. (2):
$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t - b}{a}\right) \tag{2}$$
In this study, each iPPG signal was partitioned into three successive segments of length 3 s, and the CWT of each segment was computed using the PyWavelets (pywt) library in Python. Inspired by a study by Wu et al. [21], which demonstrated the effectiveness of employing the complex Gaussian wavelet to represent cPPG for predicting blood pressure using a CNN, we adopted the complex Gaussian wavelet as the mother wavelet in our work. The scale was set to the range [1, 32], corresponding to the frequency of the heart rate. The resulting CWT images were subsequently transformed into contours at a resolution of 60 × 60 pixels. To determine the most appropriate duration for the iPPG signal, additional CWT analyses were conducted using signal durations of 6 and 9 s. Fig. 3(a–c) shows examples of CWT images obtained from iPPG signals with segments of length 3, 6, and 9 s, respectively.
Fig. 3.
Examples of CWT images for iPPG3: (a) 3 s segment; (b) 6 s segment; (c) 9 s segment.
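To make the CWT step concrete, the following NumPy sketch evaluates the standard CWT sum directly, with scales expressed in samples. It is a stand-in for, not a reproduction of, the PyWavelets call used in the study. The wavelet order (first derivative of the complex Gaussian), the lack of wavelet normalization, and the nearest-neighbor resampling to 60 × 60 pixels in place of contour rendering are all assumptions; the function names are hypothetical.

```python
import numpy as np


def cgau1(t):
    """First derivative of exp(-1j*t) * exp(-t**2): a complex Gaussian wavelet.

    The order (1) and the missing normalization constant are assumptions.
    """
    return (-1j - 2.0 * t) * np.exp(-1j * t - t ** 2)


def cwt_direct(signal, scales):
    """Direct CWT: W[a, b] = sum_t f[t] * conj(psi((t - b) / a)) / sqrt(a)."""
    x = np.asarray(signal, dtype=float)
    k = np.arange(len(x))
    coeffs = np.empty((len(scales), len(x)), dtype=complex)
    for i, a in enumerate(scales):
        # Rows index the shift b, columns index the time sample t.
        arg = (k[None, :] - k[:, None]) / a
        coeffs[i] = (np.conj(cgau1(arg)) @ x) / np.sqrt(a)
    return coeffs


def to_image(coeffs, size=60):
    """Magnitude scalogram resampled to size x size by nearest-neighbor picking."""
    mag = np.abs(coeffs)
    rows = np.linspace(0, mag.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, mag.shape[1] - 1, size).round().astype(int)
    img = mag[np.ix_(rows, cols)]
    return img / img.max()


# 6 s segment at 20 FPS (120 samples) with a synthetic 1.2 Hz pulse.
t = np.arange(120) / 20.0
segment = np.sin(2 * np.pi * 1.2 * t)
scales = np.arange(1, 33)        # scale range [1, 32]
coeffs = cwt_direct(segment, scales)
image = to_image(coeffs)
```

The direct sum costs O(N²) per scale, which is acceptable for segments of 60 to 180 samples; a library routine such as `pywt.cwt` would normally be preferred in practice.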
2.5. Blood pressure prediction model
In this study, we utilized a CNN to learn from the CWT images and predict BP. A CNN is a type of artificial neural network that is specifically designed to process data with a grid-like topology, such as an image. It is composed of multiple layers of interconnected neurons, with each layer processing the input data and passing the results to the next layer. A custom architecture was designed in order to create a compact model that was easier to train, and to avoid unnecessary constraints. A CNN with the network architecture illustrated in Fig. 4 was specifically designed to analyze CWT images of iPPG signals with dimensions of 60 × 60 pixels. The convolutional layer was composed of 128 kernels with a size of 3 × 3, a stride of one, and same padding, and a rectified linear unit (ReLU) was used for activation. The max-pooling layer used a 2 × 2 window with a stride of two and same padding on each of the 128 feature maps, to downsample the data and capture important features. The dense layer was fully connected, and consisted of 32 neurons with ReLU activation. The output of the CNN consisted of two values: the systolic blood pressure (SBP), and the diastolic blood pressure (DBP).
Fig. 4.
Structure of the CNN model.
From Table 1, it can be seen that the model has approximately 3.7 million trainable parameters, illustrating its lightweight nature. The dataset was drawn from 100 volunteers, who were partitioned into a training group (n = 80) and a testing group (n = 20). As mentioned above, each sample was segmented into three parts, yielding three CWT images per sample. Thus, the training dataset comprised 240 images, while the testing dataset contained 60 images. The efficacy of the proposed method was evaluated based on the accuracy of the predictions. The model was implemented using the TensorFlow and Keras libraries in Python.
Table 1.
Parameters of the CNN model.
| Layer (type) | Output shape | No. of parameters |
|---|---|---|
| conv2d (Conv2D) | (None, 60, 60, 128) | 3,584 |
| max_pooling2d (MaxPooling2D) | (None, 30, 30, 128) | 0 |
| flatten (Flatten) | (None, 115200) | 0 |
| dense (Dense) | (None, 32) | 3,686,432 |
| dropout (Dropout) | (None, 32) | 0 |
| dense_1 (Dense) | (None, 2) | 66 |
| Total parameters: 3,690,082 | ||
| Trainable parameters: 3,690,082 | ||
| Non-trainable parameters: 0 |
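The parameter counts in Table 1 can be checked with simple arithmetic. The helper names below are hypothetical, and the three input channels are an inference from the 3,584 convolutional parameters rather than something stated in the text.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


# 3,584 = (3 * 3 * C + 1) * 128 implies C = 3 input channels (inferred).
conv = conv2d_params(3, 3, in_channels=3, filters=128)

# Max pooling halves 60 x 60 to 30 x 30; flattening gives 30 * 30 * 128 values.
flattened = 30 * 30 * 128
hidden = dense_params(flattened, 32)
output = dense_params(32, 2)     # two outputs: SBP and DBP

total = conv + hidden + output
```

Almost all of the parameters sit in the first dense layer (3,686,432 of 3,690,082), so the size of the flattened feature map dominates the model size.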
K-fold cross-validation was conducted to investigate the optimal segment length for the iPPG signal, to ensure accurate blood pressure estimation by our CNN model. CWT images generated from segments of different lengths (3, 6, and 9 s) were passed as input to the CNN model. The test dataset was reserved for the final evaluation, and the training dataset was split into five non-overlapping folds to facilitate k-fold cross-validation. In each fold, this resulted in a training set of 192 data points, collected from 64 subjects, and a validation set of 48 data points from 16 subjects. At each iteration of the cross-validation process, the mean absolute error (MAE) and root mean squared error (RMSE) were computed for the model, and their mean values over the five folds were subsequently determined for a comprehensive assessment of the model.
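A subject-wise five-fold split that reproduces the stated set sizes might look like the sketch below. The shuffling, the seed, and the function name `subject_kfold` are assumptions; the text only states that the folds do not overlap and that each split holds 64 training and 16 validation subjects.

```python
import numpy as np


def subject_kfold(subject_ids, n_folds=5, seed=0):
    """Yield (train_subjects, val_subjects) so that no subject is in both sets."""
    rng = np.random.default_rng(seed)          # seed is an arbitrary choice
    shuffled = rng.permutation(subject_ids)
    folds = np.array_split(shuffled, n_folds)
    for i in range(n_folds):
        val_subjects = folds[i]
        train_subjects = np.concatenate(
            [folds[j] for j in range(n_folds) if j != i]
        )
        yield train_subjects, val_subjects


IMAGES_PER_SUBJECT = 3                          # three CWT images per video
splits = list(subject_kfold(np.arange(80)))     # 80 training-group subjects
sizes = [
    (len(train) * IMAGES_PER_SUBJECT, len(val) * IMAGES_PER_SUBJECT)
    for train, val in splits
]
```

Splitting by subject rather than by image keeps the three images from one video on the same side of the split, which avoids leaking a subject's data into validation.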
3. Results
3.1. Heart rate
The heart rates HR1, HR2 and HR3 were extracted from the iPPG1, iPPG2 and iPPG3 signals, respectively. These results were compared with a reference value measured with an oscillometric blood pressure meter (HRR). The parameters of the linear regression equation and the error metrics are shown in Table 2. It can be seen that HR3 obtained from the iPPG3 signal is highly consistent with the heart rate measured by the oscillometric BP meter, with R² = 0.90. The slope of the linear regression equation (m) is closest to one for this signal, and the y-intercept (c) is smallest. Although the mean error (ME) of HR3 is slightly larger in magnitude than that of HR2, its standard deviation (SD), MAE and RMSE are lowest.
Table 2.
Accuracy of heart rate results based on three different iPPG signals.
| Signal | y = mx + c | R² | ME ± SD (BPM) | MAE (BPM) | RMSE (BPM) | Remarks |
|---|---|---|---|---|---|---|
| iPPG1 | y = 0.80x+15.15 | 0.44 | 0.82 ± 10.69 | 4.38 | 10.67 | Green channel |
| iPPG2 | y = 0.91x+7.84 | 0.80 | −0.79 ± 5.34 | 2.67 | 5.37 | POS method |
| iPPG3 | y = 0.93x+6.57 | 0.90 | −1.18 ± 3.73 | 2.32 | 3.89 | Average of iPPG1 and iPPG2 |
y: HR obtained from iPPG, x: HR measured by oscillometric blood pressure meter.
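The quantities reported in Table 2 can be computed as in the following sketch. The sign convention (estimate minus reference), the use of the sample standard deviation, and R² as the squared Pearson correlation are assumptions, since the text does not define them; the example values are synthetic and are not study data.

```python
import numpy as np


def agreement_metrics(reference, estimate):
    """Regression line, R^2, and error metrics of an estimate against a reference."""
    x = np.asarray(reference, dtype=float)
    y = np.asarray(estimate, dtype=float)
    error = y - x                               # assumed sign convention
    slope, intercept = np.polyfit(x, y, 1)      # y = m x + c
    r = np.corrcoef(x, y)[0, 1]
    return {
        "m": slope,
        "c": intercept,
        "R2": r ** 2,
        "ME": error.mean(),
        "SD": error.std(ddof=1),                # sample SD (assumption)
        "MAE": np.abs(error).mean(),
        "RMSE": np.sqrt((error ** 2).mean()),
    }


# Synthetic example: reference and estimated heart rates in BPM.
metrics = agreement_metrics([60, 70, 80, 90], [61, 69, 82, 90])
```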
To analyze the agreement between the heart rates derived from the BP monitor and the iPPG signal, Bland-Altman plots were created, as shown in Fig. 5. It was found that more than 95% of the data for both HR2 and HR3 fell within the range of ±1.96 times the SD of the differences (indicated by the dotted line). Although the results in Fig. 5(b) and (c) look similar, and both show good agreement with the reference method, HR3 has the narrowest limits of agreement; hence, iPPG3 was used in the remainder of the study.
Fig. 5.
Bland-Altman plots showing agreement between heart rate measured with an oscillometric blood pressure meter (HRR) and heart rate obtained from: (a) iPPG1; (b) iPPG2; (c) iPPG3.
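The limits of agreement drawn in a Bland-Altman plot, and the share of points falling inside them, can be computed as follows. This is a sketch with synthetic values; the function name and the use of the sample standard deviation are assumptions.

```python
import numpy as np


def bland_altman(reference, estimate):
    """Bias, 1.96-SD limits of agreement, and fraction of differences inside them."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    diff = est - ref                       # plotted on the vertical axis
    mean_pair = (est + ref) / 2.0          # plotted on the horizontal axis
    bias = diff.mean()
    sd = diff.std(ddof=1)                  # sample SD (assumption)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    inside = float(np.mean((diff >= lower) & (diff <= upper)))
    return {"mean_pair": mean_pair, "diff": diff, "bias": bias,
            "lower": lower, "upper": upper, "fraction_inside": inside}


# Synthetic example (not study data).
result = bland_altman([60, 70, 80, 90], [61, 69, 82, 90])
```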
To find the most appropriate time duration of the iPPG signal for obtaining the heart rate, iPPG3 signals with durations of 3, 6, 9, 15, and 30 s were evaluated. The results in Table 3 demonstrate that a longer time duration yielded higher accuracy in the estimation of heart rate. Notably, the highest accuracy was achieved for a duration of 30 s.
Table 3.
Accuracy of heart rates for different time durations.
| Duration (s) | ME ± SD (BPM) | MAE (BPM) | RMSE (BPM) | R² |
|---|---|---|---|---|
| 3 | 1.70 ± 11.14 | 6.96 | 11.26 | 0.41 |
| 6 | 0.41 ± 8.44 | 4.49 | 8.44 | 0.75 |
| 9 | 1.09 ± 8.16 | 3.79 | 8.22 | 0.80 |
| 15 | 0.11 ± 6.24 | 2.75 | 6.22 | 0.83 |
| 30 | −1.18 ± 3.73 | 2.32 | 3.89 | 0.90 |
3.2. Blood pressure
Table 4 presents the results of a performance evaluation of CNN models trained on CWT images of different durations for the estimation of BP. The findings indicate that the mean MAE and RMSE values for both the SBP and DBP estimates were highest for a CWT duration of 3 s. Although the mean MAE values for the SBP estimates differed only marginally between CWT durations of 6 and 9 s, the mean MAE for the DBP estimate was lowest for a duration of 6 s. Moreover, the mean RMSE values for both SBP and DBP were minimized for a duration of 6 s. Thus, CWT images with a duration of 6 s were considered the most appropriate input to the CNN model in terms of accurately estimating BP.
Table 4.
Results of k-fold cross validation.
| Duration | Fold | SBP ME | SBP SD | SBP MAE | SBP RMSE | DBP ME | DBP SD | DBP MAE | DBP RMSE |
|---|---|---|---|---|---|---|---|---|---|
| 3 s | 1 | 2.23 | 9.77 | 7.37 | 9.94 | −3.28 | 3.9 | 4.12 | 5.07 |
| 3 s | 2 | −0.38 | 10.81 | 8.35 | 10.73 | 2.07 | 4.02 | 3.9 | 4.49 |
| 3 s | 3 | 0.47 | 1.03 | 6.67 | 9.28 | 9.35 | 3.73 | 2.97 | 3.84 |
| 3 s | 4 | 6.73 | 9.32 | 8.4 | 11.44 | 4.46 | 3.98 | 4.97 | 5.96 |
| 3 s | 5 | 3.12 | 9.56 | 7.11 | 9.98 | 0.82 | 3.82 | 3.15 | 3.87 |
| 3 s | AVG | 2.43 | 8.10 | 7.58 | 10.27 | 2.68 | 3.89 | 3.82 | 4.65 |
| 6 s | 1 | 3.38 | 8.71 | 7.38 | 9.28 | −0.46 | 3.76 | 3.03 | 3.76 |
| 6 s | 2 | −1.08 | 8.31 | 6.45 | 8.32 | 1.18 | 3.91 | 3.18 | 4.05 |
| 6 s | 3 | 2.63 | 8.55 | 7.13 | 8.88 | 3.58 | 3.89 | 4.32 | 5.26 |
| 6 s | 4 | 3.47 | 9.05 | 6.7 | 9.61 | 0.62 | 3.91 | 3.12 | 3.92 |
| 6 s | 5 | 3.68 | 8.77 | 6.98 | 9.44 | 2.81 | 3.48 | 3.48 | 4.45 |
| 6 s | AVG | 2.42 | 8.68 | 6.93 | 9.11 | 1.55 | 3.79 | 3.43 | 4.29 |
| 9 s | 1 | 0.48 | 8.42 | 6.22 | 8.37 | −3.32 | 3.61 | 3.85 | 4.88 |
| 9 s | 2 | 0.45 | 7.95 | 5.88 | 7.89 | 2.56 | 3.54 | 3.63 | 4.35 |
| 9 s | 3 | −0.8 | 9.88 | 7.43 | 9.83 | 2.77 | 3.73 | 4.06 | 4.62 |
| 9 s | 4 | 6.33 | 8.63 | 8 | 10.65 | −1.15 | 3.9 | 3.22 | 4.04 |
| 9 s | 5 | 5.12 | 8.37 | 6.93 | 9.78 | 2.65 | 3.41 | 3.55 | 4.29 |
| 9 s | AVG | 2.32 | 8.65 | 6.89 | 9.30 | 0.70 | 3.64 | 3.66 | 4.44 |
To evaluate the effectiveness of the proposed model, a final assessment was carried out using the training and test datasets for a CWT duration of 6 s. The performance of the model was evaluated by analyzing the correlation between the predicted and actual values for both SBP and DBP on both datasets, as illustrated in Fig. 6(a and b) and Fig. 7(a and b). The training results showed a strong correlation between the predicted and actual values, although the testing results showed a slightly lower correlation, suggesting potential variations in performance when dealing with unseen data. The agreement between the measured and predicted SBP and DBP values derived from the CNN model is shown in the form of Bland-Altman plots in Fig. 8(a and b), and Table 5 provides error metrics for the BP prediction model for the SBP and DBP. The mean differences between the predicted and actual values were 2.50 mmHg for the SBP and −1.97 mmHg for the DBP. For the SBP, the limits of agreement (mean ± 1.96 SD) were 2.50 ± 18.65 mmHg (indicated by the dotted lines), with 91.6% of the data points lying within these limits, which is slightly lower than the expected 95%. For the DBP, the limits of agreement were −1.97 ± 9.05 mmHg, with 95% of the data falling within them.
Fig. 6.
Results from the CNN on the training dataset using the wavelet transform of the iPPG signal for a duration of 6 s: (a) systolic blood pressure; (b) diastolic blood pressure.
Fig. 7.
Results from the CNN on the testing dataset using the wavelet transform of the iPPG signal for a duration of 6 s: (a) systolic blood pressure; (b) diastolic blood pressure.
Fig. 8.
Bland-Altman plots showing the results from the CNN model on the testing dataset:
(a) systolic blood pressure; (b) diastolic blood pressure.
Table 5.
Error metrics for the blood pressure prediction model.
| Input | SBP ME ± 1.96 SD (mmHg) | SBP MAE (mmHg) | SBP RMSE (mmHg) | SBP R | DBP ME ± 1.96 SD (mmHg) | DBP MAE (mmHg) | DBP RMSE (mmHg) | DBP R |
|---|---|---|---|---|---|---|---|---|
| | 2.50 ± 18.65 | 6.40 | 8.68 | 0.73 | −1.97 ± 9.05 | 3.83 | 4.98 | 0.60 |
The preliminary findings of this study suggest that the wavelet transform of iPPG signals is a promising approach for developing a BP prediction model. The accuracy of this model was evaluated based on the criteria of the British Hypertension Society (BHS) and the Association for the Advancement of Medical Instrumentation (AAMI), and the results are presented in Tables 6 and 7, respectively.
Table 6.
BHS standard.
| | | Cumulative error ≤5 mmHg | Cumulative error ≤10 mmHg | Cumulative error ≤15 mmHg |
|---|---|---|---|---|
| BHS | Grade A | 60% | 85% | 95% |
| BHS | Grade B | 50% | 75% | 90% |
| BHS | Grade C | 40% | 65% | 85% |
| Our results | SBP | 56.66% | 81.66% | 90% |
| Our results | DBP | 76.66% | 96.66% | 100% |
Table 7.
AAMI standard.
| | | ME (mmHg) | STD (mmHg) | Number of subjects |
|---|---|---|---|---|
| AAMI standard | | ≤5 | ≤8 | ≥85 |
| Our results | SBP | 2.50 | 5.09 | 20 |
| Our results | DBP | −1.97 | 4.58 | 20 |
Table 6 shows the BHS standard, in which the absolute error is evaluated by counting the frequency of absolute errors within 5, 10, and 15 mmHg, respectively, and the percentage of data within each error range is calculated to assess the performance of the device. The results for this standard showed that SBP was graded B, while DBP was graded A.
Table 7 presents the results for the prediction of SBP and DBP values based on the AAMI standard conditions, using the average error and the SD as criteria, with thresholds of 5 and 8 mmHg, respectively. The average error and SD for both the SBP and DBP were found to be lower than the thresholds in the AAMI standard, indicating good performance of the developed wavelet transform-based prediction model. However, it should be noted that the number of subjects was lower than the AAMI recommended threshold, which states that at least 85 subjects should participate in the experiment.
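The grading logic behind Tables 6 and 7 can be written out as a short check. The thresholds are those listed in the tables; the function names are hypothetical, and returning grade "D" when the grade C thresholds are missed follows the usual BHS convention rather than anything stated in this section.

```python
BHS_GRADES = (
    ("A", (60.0, 85.0, 95.0)),
    ("B", (50.0, 75.0, 90.0)),
    ("C", (40.0, 65.0, 85.0)),
)


def bhs_grade(pct_within_5, pct_within_10, pct_within_15):
    """Best grade whose three cumulative-percentage thresholds are all met."""
    observed = (pct_within_5, pct_within_10, pct_within_15)
    for grade, thresholds in BHS_GRADES:
        if all(o >= t for o, t in zip(observed, thresholds)):
            return grade
    return "D"  # worse than grade C (usual BHS convention)


def meets_aami_error_limits(mean_error, sd_error):
    """AAMI error criteria: |ME| <= 5 mmHg and SD <= 8 mmHg."""
    return abs(mean_error) <= 5.0 and sd_error <= 8.0


def meets_aami_sample_size(n_subjects):
    """AAMI sample-size criterion: at least 85 subjects."""
    return n_subjects >= 85


# Values taken from Tables 6 and 7.
sbp_grade = bhs_grade(56.66, 81.66, 90.0)
dbp_grade = bhs_grade(76.66, 96.66, 100.0)
sbp_aami = meets_aami_error_limits(2.50, 5.09)
dbp_aami = meets_aami_error_limits(-1.97, 4.58)
enough_subjects = meets_aami_sample_size(20)
```

Applied to the reported values, this gives grade B for SBP and grade A for DBP, with both error limits met and the sample-size criterion not met.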
4. Discussion
This article has presented a comparison between heart rate and BP values obtained from iPPG signals (derived from webcam-recorded videos under ambient light) and those obtained from an oscillometric blood pressure meter. Data were collected from 100 participants. The results show that the highest accuracy for determining heart rate using iPPG signals was achieved when utilizing a duration of 30 s; however, BP can be estimated with a shorter segment length of 6 s. In contrast, traditional oscillometric blood pressure meters typically require longer durations of 30–40 s to obtain both heart rate and BP values.
In general, oscillometric techniques compute the heart rate by counting the number of pulses within the time period between the first and last troughs of the oscillometric wave [30]. The oscillometric BP meter used in this study had an error margin of ±5% in the heart rate reading, which corresponds to ±3.0 to ±6.5 BPM for heart rates of 60–130 BPM. In this study, the heart rate was derived from the dominant frequency of the iPPG signal. To enhance the extraction of the iPPG signal, the POS method was employed, although this approach may encounter difficulties in distinguishing between the pulsatile and noise components, especially when their amplitude levels are similar on their respective planes [6]. To address this issue, we introduced an additional step in which we incorporated the iPPG signal obtained from the green channel into the iPPG signal obtained from the POS method. This modification improved the accuracy of the heart rate estimate. Our experimental results demonstrated that the proposed method of determining heart rate had a low level of imprecision when measuring over 30 s, with variation within the range ±5 BPM. However, due to the existence of local peaks near the maximum peak, as for example in the iPPG3 signal for subject B in Fig. 2, shorter measuring times tend to give rise to higher errors. This is because the influence of these false peaks becomes more pronounced in the frequency domain within shorter time intervals compared to longer ones. These findings align with those of a study by Jensen and Hannemose [31], which reported a strong correlation between the heart rate obtained from the green channel of 30 s videos of 12 participants and the heart rate derived from the cPPG signal. Although their reference heart rate was obtained from a different type of measuring device, they found that using a window length shorter than 10 s was not advisable, as it led to a significant increase in bias. In a similar study, Viejo et al. [8] developed a machine learning model to predict heart rate based on raw video analysis and compared the results with those from an oscillometric monitor. Their machine learning model considered three face regions and involved 15 participants, resulting in a correlation coefficient (R) of 0.85. The findings of the present study indicate that a non-contact method can be employed for heart rate determination with accurate results, and is feasible for use in screening applications.
The extraction of BP features from PPG signals using waveform morphology is a complex and intricate process, and a CWT was therefore used to represent the continuous time-frequency content of the PPG signal [21,25,32]. Different segment lengths have been considered in studies investigating BP through the use of a CWT with PPG signals. Liang et al. previously reported the successful classification of hypertension by analyzing 5 s segments of cPPG using a CWT [30], whereas Wu et al. found that CWTs performed optimally with segment lengths of 2.0 and 2.4 s for cPPG [21]. To represent iPPG signals, Bousefsaf et al. demonstrated that a segment length of 2.56 s was adequate for predicting the CWT for BP [25]. In our study, we found that using a segment length of 6 s for iPPG led to better performance of the model compared to 3 s and 9 s. Although our optimal segment length differs from that reported by Bousefsaf et al., both results support the effectiveness of the CWT as a method for representing iPPG signals in predicting BP. Furthermore, similarly to the observations made by Wu et al., we noted that using shorter segment lengths decreased the prediction accuracy.
Moreover, it can be demonstrated that the variability in blood pressure prediction error is associated with differences in the morphology of the iPPG signal. Fig. 9 presents the MAE, absolute maximum error, and absolute minimum error of the predicted blood pressure values for each subject in the test dataset. Notably, subject #19 exhibits the smallest errors in both SBP and DBP values, whereas subject #12 demonstrates the highest errors in both SBP and DBP values. The iPPG signals and CWT images for three distinct segments from these two subjects are therefore shown in Figs. 10 and 11, respectively.
Fig. 9.
Evaluation of predicted blood pressure accuracy for each subject in the test dataset. The circle mark signifies mean absolute error, and the error bar indicates the range of errors (maximum and minimum). Blue circle marks represent systolic blood pressure (SBP), while orange circle marks represent diastolic blood pressure (DBP). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Fig. 10.
iPPG signal and its Continuous Wavelet Transform (CWT) images for three segments of subject #19, showcasing closely predicted values. The following subplots are presented: (a) iPPG segment 1, (b) iPPG segment 2, (c) iPPG segment 3, (d) CWT of iPPG segment 1, (e) CWT of iPPG segment 2, and (f) CWT of iPPG segment 3.
Fig. 11.
iPPG signal and its Continuous Wavelet Transform (CWT) images for three segments of subject #12, highlighting instances of the highest prediction errors. The subplots include: (a) iPPG segment 1, (b) iPPG segment 2, (c) iPPG segment 3, (d) CWT of iPPG segment 1, (e) CWT of iPPG segment 2, and (f) CWT of iPPG segment 3.
The morphology of the iPPG signal exhibited variations between Fig. 10(a–c) and Fig. 11(a–c), particularly evident in the absence of certain characteristics in the valley of the iPPG signal in Fig. 11. Consequently, in Fig. 11(d–f), the Continuous Wavelet Transform (CWT) images highlighted a noticeable lack of the frequency component around 1–2 Hz. In contrast, Fig. 10(d–f) depicted a distinctly clear presence of the low-frequency component. Various factors, including differences in skin color, skin thickness, and the fixed video-capture setup, may contribute to the disparities in derived data. Hence, a future challenge is to create adaptive procedures for each individual to tackle the issue of missing information.
When the results of this study were compared with those of other research on BP estimation with iPPG signals (Table 8), our data collection method was found to be similar to that of Rong et al. [10], who used a webcam and an oscillometric blood pressure meter. Their study used 16 features obtained from the iPPG waveform, including heart rate, as input data for a support vector regression to predict BP values. Another study by Luo et al. carried out continuous non-invasive BP monitoring to measure changes in blood flow in the finger, providing data on BP changes and pulse signals for a large sample of 1328 volunteers [15]. They considered 155 important feature values, which were reduced to 29 and passed as input data to a prediction model. The model predicted SBP with an accuracy of 94.81% and DBP with an accuracy of 95.71%, although the MAE values were not provided. Bousefsaf et al. [25] utilized data from the BP4D+ database, which includes video and blood pressure time series. Although they employed CWT images of iPPG and a CNN for blood pressure prediction, their CNN did not directly provide blood pressure values. Instead, it generated CWT images of blood pressure, and the blood pressure time series was derived by inverting the CWT. Among approaches that use iPPG signals directly as input to a blood pressure model, without the need for feature extraction, Li et al. [26] demonstrated the efficacy of deep learning methods in estimating blood pressure from iPPG waveforms combined with personal information such as height, weight, gender, and BMI, and Cheng et al. [27] introduced a multi-stage deep learning model for blood pressure prediction based on iPPG signals. Our model was found to have better performance than those of Rong et al. and Li et al., and was comparable to those of Bousefsaf et al. and Cheng et al. However, our dataset was smaller than those of most of the other studies.
In addition, a comparison of methods across studies is difficult due to the disparate nature of the datasets. Each study described above used different devices and setups, and included populations with a diverse range of skin tones and races. Although our study and those of Rong et al., Li et al., and Cheng et al. used oscillometric BP meters, it is important to consider that different devices may rely on different algorithms, which may impact the accuracy of the measurements [33].
Table 8.
Comparison of results with other studies using iPPG signals.
| Study | Dataset | Input | BHS standard, SBP | BHS standard, DBP | AAMI standard, SBP | AAMI standard, DBP | MAE (mmHg), SBP | MAE (mmHg), DBP |
|---|---|---|---|---|---|---|---|---|
| Luo et al. [15] | Self-made video (1328 subjects) | 29 features | – | – | ✓ | ✓ | – | – |
| Rong et al. [10] | Self-made video (191 subjects) | 16 features | C | B | ✓ | ✓ | 9.97 | 7.59 |
| Bousefsaf et al. [25] | BP4D+ database (57 subjects) | CWT image | B | A | ✓a | ✓a | 6.73 | 5.1 |
| Li et al. [26] | Self-made video (814 subjects) | iPPG signals | C | B | – | – | 8.36 | 5.69 |
| Cheng et al. [27] | Self-made video (115 subjects) | iPPG signals | B | A | ✓a | ✓a | 5.33 | 4.02 |
| Our study | Self-made video (100 subjects) | CWT image | B | A | ✓a | ✓a | 6.40 | 3.83 |
a Indicates that the number of participants in the study was lower than 85.
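The BHS grades and AAMI checks reported in Table 8 follow fixed numerical rules, which the sketch below applies to a set of paired reference and predicted values. The thresholds are the commonly cited ones (BHS: cumulative percentage of absolute errors within 5, 10, and 15 mmHg; AAMI: mean error within ±5 mmHg and standard deviation of at most 8 mmHg). The function names and the example error values are hypothetical, the use of the sample standard deviation is an assumption, and the AAMI requirement of at least 85 participants is not checked here.

```python
import numpy as np

# Minimum percentage of |errors| within 5, 10 and 15 mmHg for each BHS grade.
BHS_THRESHOLDS = {
    "A": (60.0, 85.0, 95.0),
    "B": (50.0, 75.0, 90.0),
    "C": (40.0, 65.0, 85.0),
}

def bhs_grade(bp_true, bp_pred):
    """Return the BHS grade ('A' to 'D') for paired BP values in mmHg."""
    abs_err = np.abs(np.asarray(bp_pred, dtype=float)
                     - np.asarray(bp_true, dtype=float))
    pct = [100.0 * np.mean(abs_err <= limit) for limit in (5, 10, 15)]
    for grade, minimums in BHS_THRESHOLDS.items():   # checked in order A, B, C
        if all(p >= m for p, m in zip(pct, minimums)):
            return grade
    return "D"

def aami_pass(bp_true, bp_pred):
    """True if mean error is within +/-5 mmHg and SD is at most 8 mmHg.

    Uses the sample standard deviation (ddof=1), which is an assumption.
    """
    err = np.asarray(bp_pred, dtype=float) - np.asarray(bp_true, dtype=float)
    return bool(abs(err.mean()) <= 5.0 and err.std(ddof=1) <= 8.0)

# Invented errors: 70% at 2 mmHg, 20% at 8, 8% at 12, 2% at 20 mmHg.
reference = np.zeros(100)
predicted = np.array([2.0] * 70 + [8.0] * 20 + [12.0] * 8 + [20.0] * 2)
print(bhs_grade(reference, predicted), aami_pass(reference, predicted))
```

Because the BHS grade depends on the cumulative error distribution while the AAMI check depends on the mean and spread of the signed error, a method can satisfy one criterion and not the other, which is why Table 8 reports them separately.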
In this study, we employed a custom lightweight CNN model rather than using models pre-trained on ImageNet. While pre-trained models can be fine-tuned for efficient performance, it is not always necessary to use highly trained models [34,35]. However, the performance of CNN models may vary with design choices such as the model depth, optimizer, loss function, and preprocessing steps [36,37]. In addition, it is important to consider the limitations of the current dataset, which contained data on a small number of subjects and was recorded in a controlled laboratory environment. Although facial videos were captured under ambient light conditions, movement was restricted during measurements to maintain camera focus and minimize vibrations.
To improve the practicality and applicability of the proposed method, the dataset could be expanded to include a larger and more diverse set of subjects, which may affect the accuracy of both heart rate and BP estimation. Future work should investigate the possibility of developing our estimation method using data collected on-site, in a real-world setting, to validate its effectiveness in practical clinical scenarios.
5. Conclusions
This article has introduced a non-contact method for measuring heart rate and BP using iPPG signals obtained from webcam-recorded videos under ambient light conditions, and our results were compared to those of an oscillometric blood pressure meter. Our study demonstrates the effectiveness of a modified iPPG signal acquisition method, showcasing its capability to achieve higher accuracy in determining heart rate. We also showed the direct estimation of BP by utilizing the CWT image of the iPPG signal and a compact CNN model. The findings suggest that heart rate and BP can be accurately estimated using segment lengths of 30 s and 6 s, respectively. Our model achieved grade A for DBP and grade B for SBP, based on the criteria set by the BHS. In addition, the ME and SD for the SBP and DBP also satisfied the criteria set by the AAMI. The proposed framework can enhance the efficiency of non-contact screening approaches. Further research should focus on validating the effectiveness of the estimation method by utilizing real-world data collected on site.
Declaration
The use of human subjects in this study was approved by the Ubon Ratchathani University Research Ethics Board (UBU-REC-136/2564).
Data availability statement
The data supporting the findings of this study are available upon request from the corresponding author. Due to ethical restrictions, the data are not publicly available.
CRediT authorship contribution statement
Suchin Trirongjitmoah: Writing – original draft, Resources, Methodology, Funding acquisition, Conceptualization. Arphorn Promking: Software, Methodology, Investigation, Data curation. Khanittha Kaewdang: Writing – review & editing, Supervision. Nisarut Phansiri: Visualization, Validation. Kriengsak Treeprapin: Writing – review & editing, Supervision, Investigation.
Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Suchin Trirongjitmoah reports financial support was provided by National Science, Research and Innovation Fund of Thailand. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This research was supported by the National Science, Research and Innovation Fund of Thailand.
References
- 1. Mousavi S.S., Firouzmand M., Charmi M., Hemmati M., Moghadam M., Ghorbani Y. Blood pressure estimation from appropriate and inappropriate PPG signals using a whole-based method. Biomed. Signal Process Control. 2019;47:196–206. doi: 10.1016/j.bspc.2018.08.022.
- 2. Lei R., Ling B.W.-K., Feng P., Chen J. Estimation of heart rate and respiratory rate from PPG signal using complementary ensemble empirical mode decomposition with both independent component analysis and non-negative matrix factorization. Sensors. 2020;20(11):3238. doi: 10.3390/s20113238.
- 3. Le T., et al. Continuous non-invasive blood pressure monitoring: a methodological review on measurement techniques. IEEE Access. 2020;8:212478–212498. doi: 10.1109/ACCESS.2020.3040257.
- 4. Wójcikowski M., Pankiewicz B. Photoplethysmographic time-domain heart rate measurement algorithm for resource-constrained wearable devices and its implementation. Sensors. 2020;20(6):1783. doi: 10.3390/s20061783.
- 5. Sun Y., Thakor N. Photoplethysmography revisited: from contact to noncontact, from point to imaging. IEEE Trans. Biomed. Eng. 2016;63(3):463–477. doi: 10.1109/TBME.2015.2476337.
- 6. Wang W., den Brinker A.C., Stuijk S., de Haan G. Algorithmic principles of remote PPG. IEEE Trans. Biomed. Eng. 2017;64(7):1479–1491. doi: 10.1109/TBME.2016.2609282.
- 7. Cennini G., Arguel J., Akşit K., van Leest A. Heart rate monitoring via remote photoplethysmography with motion artifacts reduction. Opt Express. 2010;18(5):4867. doi: 10.1364/OE.18.004867.
- 8. Gonzalez Viejo C., Fuentes S., Torrico D., Dunshea F. Non-contact heart rate and blood pressure estimations from video analysis and machine learning modelling applied to food sensory responses: a case study for chocolate. Sensors. 2018;18(6):1802. doi: 10.3390/s18061802.
- 9. Premkumar S., Hemanth D.J. Intelligent remote photoplethysmography-based methods for heart rate estimation from face videos: a survey. Informatics. 2022;9(3):57. doi: 10.3390/informatics9030057.
- 10. Rong M., Li K. A Blood pressure prediction method based on imaging photoplethysmography in combination with machine learning. Biomed. Signal Process Control. 2021;64 doi: 10.1016/j.bspc.2020.102328.
- 11. Fan X., Ye Q., Yang X., Choudhury S.D. Robust blood pressure estimation using an RGB camera. J. Ambient Intell. Hum. Comput. 2020;11(11):4329–4336. doi: 10.1007/s12652-018-1026-6.
- 12. Adachi Y., Edo Y., Ogawa R., Tomizawa R., Iwai Y., Okumura T. 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) IEEE; Berlin, Germany: Jul. 2019. Noncontact blood pressure monitoring technology using facial photoplethysmograms; pp. 2411–2415.
- 13. Jeong I.C., Finkelstein J. Introducing contactless blood pressure assessment using a high speed video camera. J. Med. Syst. 2016;40(4):77. doi: 10.1007/s10916-016-0439-z.
- 14. Rizal A., Lin Y.-C., Lin Y.-H. 3rd International Conference on Intelligent Green Building and Smart Grid (IGBSG) IEEE; Yi-Lan: Apr. 2018. Contactless vital signs measurement for self-service healthcare kiosk in intelligent building; pp. 1–4.
- 15. Luo H., et al. Smartphone-based blood pressure measurement using transdermal optical imaging technology. Circ: Cardiovascular Imaging. 2019;12(8) doi: 10.1161/CIRCIMAGING.119.008857.
- 16. Askarian B., Jung K., Chong J.W. Monitoring of heart rate from photoplethysmographic signals using a Samsung Galaxy Note 8 in underwater environments. Sensors. 2019;19(13):2846. doi: 10.3390/s19132846.
- 17. Ismail S., Akram U., Siddiqi I. Heart rate tracking in photoplethysmography signals affected by motion artifacts: a review. EURASIP J. Appl. Signal Process. 2021;2021(1):5. doi: 10.1186/s13634-020-00714-2.
- 18. El-Hajj C., Kyriacou P. Deep learning models for cuffless blood pressure monitoring from PPG signals using attention mechanism. Biomed. Signal Process Control. 2021;65 doi: 10.1016/j.bspc.2020.102301.
- 19. Qiu Y., et al. Cuffless blood pressure estimation based on composite neural network and graphics information. Biomed. Signal Process Control. 2021;7 doi: 10.1016/j.bspc.2021.103001.
- 20. Li Y.-H., Harfiya L.N., Chang C.-C. Featureless blood pressure estimation based on photoplethysmography signal using CNN and BiLSTM for IoT devices. Wireless Commun. Mobile Comput. 2021;2021:1–10. doi: 10.1155/2021/9085100.
- 21. Wu J., Liang H., Ding C., Huang X., Huang J., Peng Q. Improving the accuracy in classification of blood pressure from photoplethysmography using continuous wavelet transform and deep learning. Int. J. Hypertens. 2021;2021:1–9. doi: 10.1155/2021/9938584.
- 22. Tamura T., Maeda Y., Sekine M., Yoshida M. Wearable photoplethysmographic sensors—past and present. Electronics. 2014;3(2):282–302. doi: 10.3390/electronics3020282.
- 23. Konstantinidis D., et al. Wearable blood pressure measurement devices and new approaches in hypertension management: the digital era. J. Hum. Hypertens. 2022;36(11):945–951. doi: 10.1038/s41371-022-00675-z.
- 24. Bousefsaf F., Djeldjli D., Ouzar Y., Maaoui C., Pruski A. iPPG 2 cPPG: reconstructing contact from imaging photoplethysmographic signals using U-Net architectures. Comput. Biol. Med. 2021;138 doi: 10.1016/j.compbiomed.2021.104860.
- 25. Bousefsaf F., Desquins T., Djeldjli D., Ouzar Y., Maaoui C., Pruski A. Estimation of blood pressure waveform from facial video using a deep U-shaped network and the wavelet representation of imaging photoplethysmographic signals. Biomed. Signal Process Control. 2022;78 doi: 10.1016/j.bspc.2022.103895.
- 26. Li Y., Wei M., Chen Q., Zhu X., Li H., Wang H., Luo J. Hybrid D1DCnet using forehead iPPG for continuous and noncontact blood pressure measurement. IEEE Sensor. J. 2022;23(3):2727–2736.
- 27. Cheng H., Xiong J., Chen Z., Chen J. Deep learning-based non-contact iPPG signal blood pressure measurement research. Sensors. 2023;23(12):5528. doi: 10.3390/s23125528.
- 28. Verkruysse W., Svaasand L.O., Nelson J.S. Remote plethysmographic imaging using ambient light. Opt Express. 2008;16(26) doi: 10.1364/OE.16.021434.
- 29. Wang W., Shan C. Impact of makeup on remote-PPG monitoring. Biomed. Phys. Eng. Express. 2020;6(3) doi: 10.1088/2057-1976/ab51ba.
- 30. Kumar S., Yadav S., Kumar A. Oscillometric waveform evaluation for blood pressure devices. Biomedical Engineering Advances. 2022;4
- 31. Jensen J.N., Hannemose M. Technical University of Denmark, Department of Applied Mathematics and Computer Science. DTU Computer: Lyngby; Denmark: 2014. Camera-based heart rate monitoring. 17.
- 32. Liang Y., Chen Z., Ward R., Elgendi M. Photoplethysmography and deep learning: enhancing hypertension risk stratification. Biosensors. 2018;8(4):101. doi: 10.3390/bios8040101.
- 33. Jilek J., Fukushima T. Oscillometric blood pressure measurement: the methodology, some observations, and suggestions. Biomedical Instrumentation. 2005;39(3):237–241. doi: 10.2345/0899-8205(2005)39[237:OBPMTM]2.0.CO;2.
- 34. Alzubaidi L., et al. Deepening into the suitability of using pre-trained models of ImageNet against a lightweight convolutional neural network in medical imaging: an experimental study. Peer J. Computer Science. 2021;7:e715. doi: 10.7717/peerj-cs.715.
- 35. Rastegar S., Gholam Hosseini H., Lowe A. Hybrid CNN-SVR blood pressure estimation model using ECG and PPG Signals. Sensors. 2023;23(3):1259. doi: 10.3390/s23031259.
- 36. Schrumpf F., Serdack P.R., Fuchs M. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) IEEE; New Orleans, LA, USA: Jun. 2022. Regression or classification? Reflection on BP prediction from PPG data using deep neural networks in the scope of practical applications; pp. 2171–2180.
- 37. Khan A., Khan S.H., Saif M., Batool A., Sohail A., Waleed Khan M. A survey of deep learning techniques for the analysis of COVID-19 and their usability for detecting omicron. J. Exp. Theor. Artif. Intell. 2023:1–43. doi: 10.1080/0952813X.2023.2165724.