The Journal of Clinical Hypertension. 2020 Jun 4;22(6):1098–1100. doi: 10.1111/jch.13914

DXA measured body composition predicts blood pressure using machine learning methods

Tanmay Nath 1, Rexford S Ahima 2, Prasanna Santhanam 2
PMCID: PMC8030107  PMID: 32497407

1. INTRODUCTION

Elevated systolic and ambulatory blood pressure (BP) is significantly associated with cardiovascular mortality. 1,2 Prior studies have examined different predictors of systolic BP, including age, ethnicity, and body mass index (BMI). 3,4 Machine learning techniques have recently been employed in different facets of biomedical and health care research. In this study, we used different supervised machine learning models to predict average systolic and diastolic BP from age, dual‐energy X‐ray absorptiometry (DXA) measured body composition parameters, BMI, and waist circumference.

We used data from the National Health and Nutrition Examination Survey (NHANES) 1999‐2004, which included DXA scans acquired using Hologic QDR 4500A fan‐beam densitometers (Hologic, Inc). The following data were compiled and tabulated: (a) age at screening, (b) initial BMI (kg/m²), (c) different metrics of DXA measured body composition (truncal, total, subtotal (minus head), skin thickness, etc), and (d) waist circumference (in cm). The variables used for the machine learning models, classified by category, are listed in Table 1. The average systolic and diastolic BP measurements (averaged over multiple readings) were evaluated as dependent variables. We selected 9803 patients (after excluding patients with missing information) aged 18 to 75 years and converted the raw data into a structured data frame, which was fed into the various machine learning models. All features in the dataset were normalized using:

$$X_{\text{norm}} = \frac{X - \mu}{\sigma}$$

where X is the original feature vector, μ is the mean of the feature vector, and σ is its standard deviation. To train the machine learning models, we randomly split the entire dataset into a training set (70%) and a testing set (30%). The training set contains known outputs, and the algorithm learns from it so that it can generalize to other data. During training, the algorithm never sees the testing set, which is why the testing set is used to assess how well the algorithm generalizes to new data.

In this study, we used five commonly used supervised machine learning algorithms to predict the mean systolic and diastolic BP: random forest, gradient boosting, support vector regression (SVR), multi‐layer perceptron (MLP), and stacking regression. We designed the stacking regression model by combining the outputs of random forest, gradient boosting, SVR, and MLP, and used ridge regression to compute the final prediction.

Each algorithm has a set of parameters whose values must be defined before the algorithm starts learning. These parameters are called hyper‐parameters. The hyper‐parameters of each machine learning model were tuned using a fivefold cross‐validation grid search, a widely used model selection strategy that exhaustively searches over a specified grid of parameter values. To avoid over‐fitting, a situation in which the algorithm fails to predict anything informative on unseen data, we performed fivefold cross‐validation on the training dataset for each model to evaluate its cross‐validation performance. In k‐fold cross‐validation (in our case, k = 5), the training dataset is split into k smaller sets. For each fold, the algorithm is trained on k−1 of the k folds, and the remaining fold is used for validation.
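The training workflow described above can be sketched with scikit‐learn roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic stand‐ins for the NHANES variables, and the hyper‐parameter values and the tiny search grid are assumptions chosen only to keep the example fast. One deliberate difference from the text is that the z‐score scaler is placed inside the pipeline, so it is fitted on the training folds only instead of on the entire dataset.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the study data frame (NOT NHANES data):
# 200 rows, 6 numeric features, and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 120 + 8 * X[:, 0] + 3 * X[:, 1] + rng.normal(scale=5, size=200)

# Random 70% training / 30% testing split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

# The four base learners named in the text; all settings are assumptions.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=20, random_state=0)),
    ("gb", GradientBoostingRegressor(n_estimators=20, random_state=0)),
    ("svr", SVR()),
    ("mlp", MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                         max_iter=500, random_state=0)),
]

# Ridge regression combines the base learners' out-of-fold predictions.
stack = StackingRegressor(
    estimators=base_learners, final_estimator=Ridge(), cv=5
)

# StandardScaler applies (X - mean) / std, fitted on training data only.
model = make_pipeline(StandardScaler(), stack)

# Fivefold cross-validated grid search over a deliberately small grid.
param_grid = {
    "stackingregressor__final_estimator__alpha": [0.1, 10.0],
    "stackingregressor__svr__C": [1.0, 10.0],
}
search = GridSearchCV(model, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

best_params = search.best_params_  # chosen hyper-parameter combination
cv_r2 = search.best_score_         # mean cross-validated R^2 of that choice
print(best_params, round(cv_r2, 3))
```

Keeping the scaler in the pipeline means each cross‐validation fold is normalized with statistics from its own training portion, which avoids leaking information from the validation fold. A real analysis would search a larger grid for every base learner.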
Finally, we tested our models on the testing dataset and used the coefficient of determination (R²) and mean absolute error (MAE) to compare their performance. We used gradient boosting to identify the features that are most important in determining systolic and diastolic BP. Our open source analysis was conducted in Python version 3.6 (https://www.python.org) using the scikit‐learn library. 5
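The evaluation and feature ranking steps might look like the sketch below. It is an illustration under stated assumptions, not the published analysis: the data are random, the feature names are hypothetical labels with no clinical meaning, and the ranking uses scikit‐learn's impurity‐based `feature_importances_`, which may not be the exact importance measure used in the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical labels attached to synthetic columns (NOT study data).
feature_names = ["age", "waist_circumference", "bmi", "pulse_rate"]

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
# By construction, the target depends mostly on the first column.
y = 120 + 10 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=3, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1
)

# Fit on the training set only; the testing set is held out.
gb = GradientBoostingRegressor(random_state=1).fit(X_train, y_train)
y_pred = gb.predict(X_test)

# Performance on the held-out testing set.
r2 = r2_score(y_test, y_pred)              # coefficient of determination
mae = mean_absolute_error(y_test, y_pred)  # mean absolute error

# Impurity-based importances are non-negative and sum to 1.
ranking = sorted(
    zip(feature_names, gb.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
print(f"R^2 = {r2:.3f}, MAE = {mae:.3f}")
```

Both metrics are computed on data the model never saw during fitting, which matches the role of the testing dataset described in the text.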

TABLE 1. The variables used in the machine learning methods, classified by category

Body measures: waist circumference (cm); arm circumference (cm); thigh circumference (cm); subscapular skin thickness (mm); triceps skin (mm); weight (kg); BMI (kg/m²); pulse rate (/min)

DXA measured body composition measures: head fat (gm); head lean mass (gm); head fat %; left arm fat %; left leg fat %; right arm fat (gm); right leg fat %; thoracic BMD (bone mineral density); lumbar BMD (gm/cm²); pelvis BMD (gm/cm²); truncal fat mass (gm); truncal lean mass (gm); total truncal mass (gm); truncal fat %; subtotal lean mass (gm); subtotal mass (gm); subtotal fat %; total BMD (gm/cm²); total fat mass (gm); total fat %; total lean mass (gm)

Demographic: age (years); gender

The findings show that although age is the most important factor in determining BP, body composition metrics rank above BMI and weight as features of importance. The results of our study are shown in Figure 1. Stacking regression achieved the highest R² value and lowest MAE for predicting the average systolic and diastolic BP. Our study did not include categorical variables such as smoking history, anti‐hypertensive drug use, or ethnicity, which might limit the overall significance of the findings. Certain important determinants of cardiovascular risk, such as pulse pressure, could not be used because they are derived from systolic and diastolic BP. Our study nevertheless shows that DXA measurement of body composition, a relatively inexpensive tool associated with a low level of radiation exposure, might be useful in identifying populations at risk of developing hypertension, especially when traditional markers such as BMI and waist circumference do not adequately capture the risk.

FIGURE 1.

Stacking regression achieves the highest R² and lowest MAE for predicting the average systolic and diastolic BP. The top five features of importance in determining average systolic BP were age, waist circumference (in cm), truncal fat, truncal total, and total lean mass (gm). In contrast, the top five features of importance in determining average diastolic BP were age (in years), pulse rate (per minute), truncal fat, waist circumference (in cm), and BMI.

CONFLICT OF INTEREST

None.

Nath T, Ahima RS, Santhanam P. DXA measured body composition predicts blood pressure using machine learning methods. J Clin Hypertens. 2020;22:1098–1100. 10.1111/jch.13914

Tanmay Nath and Prasanna Santhanam contributed equally to this study.

REFERENCES

  • 1. Banegas JR, Ruilope LM, de la Sierra A, et al. Relationship between clinic and ambulatory blood‐pressure measurements and mortality. N Engl J Med. 2018;378(16):1509‐1520.
  • 2. Brunstrom M, Carlberg B. Association of blood pressure lowering with mortality and cardiovascular disease across blood pressure levels: a systematic review and meta‐analysis. JAMA Intern Med. 2018;178(1):28‐36.
  • 3. Bundy JD, Li C, Stuchlik P, et al. Systolic blood pressure reduction and risk of cardiovascular disease and mortality: a systematic review and network meta‐analysis. JAMA Cardiol. 2017;2(7):775‐781.
  • 4. Chen SC, Lo TC, Chang JH, Kuo HW. Ethnic disparities in blood pressure: a population‐based study. J Immigr Minor Health. 2017;19(6):1427‐1433.
  • 5. Pedregosa F, Varoquaux G, Gramfort A, Michel V, et al. Scikit‐learn: machine learning in Python. J Mach Learn Res. 2011;12:2825‐2830.

Articles from The Journal of Clinical Hypertension are provided here courtesy of Wiley
