Abstract
Background
Pulmonary hypertension (PH) poses a significant health threat with high morbidity and mortality, necessitating improved diagnostic tools for enhanced management. Current biomarkers for PH lack functionality and comprehensive diagnostic and prognostic capabilities. Therefore, there is a critical need to develop biomarkers that address these gaps in PH diagnostics and prognosis.
Methods
To address this need, we employed a comprehensive metabolomics analysis of 233 blood-based samples coupled with machine learning analysis. For functional insights, human pulmonary arteries (PA) of idiopathic pulmonary arterial hypertension (PAH) lungs were investigated, and the effect of extrinsic free fatty acids (FFAs) on human PA endothelial and smooth muscle cells was tested in vitro.
Results
PA of idiopathic PAH lungs showed lipid accumulation and altered expression of lipid homeostasis-related genes. In vitro, extrinsic FFAs caused excessive proliferation of PA smooth muscle cells and barrier dysfunction of PA endothelial cells, both hallmarks of PAH.
In the training cohort of 74 PH patients, 30 disease controls without PH, and 65 healthy controls, diagnostic and prognostic markers were identified and subsequently validated in an independent cohort. Exploratory analysis showed a highly impacted metabolome in PH patients and machine learning confirmed a high diagnostic potential. Fully explainable specific free fatty acid (FFA)/lipid-ratios were derived, providing exceptional diagnostic accuracy with an area under the curve (AUC) of 0.89 in the training and 0.90 in the validation cohort, outperforming machine learning results. These ratios were also prognostic and complemented established clinical prognostic PAH scores (FPHR4p and COMPERA2.0), significantly increasing their hazard ratios (HR) from 2.5 and 3.4 to 4.2 and 6.1, respectively.
Conclusion
Our research confirms the significance of lipidomic alterations in PH and introduces innovative diagnostic and prognostic biomarkers. These findings have the potential to reshape PH management strategies.
Keywords: biomarker, prognosis, pulmonary hypertension, blood-based test, fatty acid to lipid ratio, lipidomics
Introduction
Pulmonary hypertension (PH) affects 1% of the world’s population1,2 and thus represents a significant global health problem. Even mild PH is a strong negative prognosticator in patients with left heart disease3, lung disease4, and pulmonary arterial hypertension (PAH)5,6.
In recent decades, global research efforts have led to targeted therapies for the rare pulmonary vascular diseases PAH and chronic thromboembolic PH (CTEPH). Despite this, the estimated five-year survival rate for newly diagnosed PAH patients has remained at only 61%7,8.
Diagnosis of PH is challenging because measurement of pulmonary arterial pressure (PAP) requires right heart catheterization (RHC), whereas non-invasive methods provide only estimates of PAP1 or are not widely available9. Natriuretic peptide levels (BNP or NT-proBNP) are the only recommended biomarkers for PH, but they are not specific for PH. Therefore, the development of new diagnostic tools for the detection of PH, risk stratification, and epidemiological studies remains an important issue2.
Fibroproliferative remodelling of distal pulmonary arterioles drives elevation in pulmonary vascular resistance and pulmonary arterial pressure. This is associated with unique metabolic changes as detected from the circulation10–15 as a reflection of the profound changes in the cells and matrix of the right ventricle and pulmonary vessel walls16. Our hypothesis was that PH patients may present with a characteristic metabolic profile that allows for detection of PH and risk stratification1. We identified lipidomic changes in the small pulmonary arteries (PA) of IPAH patients and a unique lipidomic profile in a diverse PH cohort, allowing identification of PH among healthy and diseased controls, which was confirmed in an independent validation cohort. In addition, we show that simple markers of this lipidomic profile are significantly associated with survival and improve the accuracy of two established prognostic PAH scores.
Results
Clinical and cardiopulmonary hemodynamic characteristics for the study cohorts
Our study included three classes of subjects: PH patients, healthy control subjects (HC) and lung disease controls (DC) without PH, all of whom underwent blood-based high resolution mass spectrometry (HRMS)-based metabolomics (see Fig. 1). PH patients had a mean PAP (mPAP) ≥ 25 mmHg and were either 1) PAH, 2) PH due to left heart disease (LV), 3) PH due to lung disease, or 4) CTEPH. All DC had mPAP < 25 mmHg with either metabolic syndrome, chronic obstructive pulmonary disease (COPD) or interstitial lung disease (ILD). The distributions in sex and body mass index (BMI) between PH and HC/DC were similar (see Fig. 1C). Age was in the same range, although controls tended to be younger. Metabolites were not significantly correlated with age (see Fig. S3).
The training cohort consisted of PH patients, DC and HC, all sampled in Graz, Austria. The external validation cohort contained PH patients and HC from Zürich, Switzerland and Regensburg, Germany. Patient characteristics are summarized in Table 1.
Table 1: Patient characteristics of the training (n=169) and validation (n=64) cohorts.
Parameter | Training HC (n=65) | Training DC (n=30) | Training PH (n=74) | Validation HC (n=7) | Validation PH (n=57) |
---|---|---|---|---|---|
Age, years | 58.0±2.8 | 59.5±4.7 | 66.5±3.0 | 42.0±13.8 | 57.0±3.7 |
Female:male (ratio) | 44:21 (2.1:1) | 20:10 (2.0:1) | 50:24 (2.1:1) | 5:2 (2.5:1) | 43:14 (3.1:1) |
BMI (kg/m2) | 24.1±0.9 | 26.2±3.0 | 25.9±1.1 | 21.0±1.7 | 26.3±1.4 |
Time since diagnosis (years) | - | 8±1.8 | 1.0±1.2 | - | 5.5±1.7 |
Pulmonary hemodynamics from RHC | |||||
mPAP (mmHg) mean pulmonary arterial pressure | - | - | 42.0±2.5 | - | 50.0±3.9 |
PAWP (mmHg) pulmonary arterial wedge pressure | - | - | 10.0±1.3 | - | 9.0±0.9 |
CO (L/min) cardiac output | - | - | 4.3±0.4 | - | 4.2±0.4 |
CI (L/min/m2) cardiac index | - | - | 2.6±0.2 | - | 2.4±0.2 |
PVR (WU) pulmonary vascular resistance | - | - | 7.1±1.0 | - | 9.1±1.7 |
RAP (mmHg) right atrial pressure | - | - | 6.0±1.2 | - | 7.5±1.1 |
Clinical data | |||||
6MWD (m) 6-min walk distance | - | 454±41 | 330±34 | - | 390±31 |
WHO FC world health organisation functional class | - | 2.0±0.2 | 3.0±0.2 | - | 2.0±0.2 |
FEV1 (% predicted) forced expiratory volume 1 s | - | 51.5±9.9 | 75.0±5.6 | - | 82.2±3.0 |
FVC (% predicted) forced vital capacity | - | 63.8±6.7 | 82.8±5.5 | - | 91.4±4.0 |
FEV1/FVC (% predicted) | - | 56.3±8.5 | 74.4±3.0 | - | 75.7±2.1 |
TLC (% predicted) total lung capacity | - | 103.0±10.8 | 94.0±3.9 | - | 96.5±3.5 |
DLCO cSB (% predicted) single-breath CO diffusing capacity, haemoglobin corrected | - | 49.6±7.8 | 63.8±6.0 | - | 53.8±3.9 |
DLCO cVA (% predicted) diffusing capacity for CO alveolar volume, haemoglobin corrected | - | 70.0±7.9 | 69.6±6.1 | - | 62.0±3.7 |
RDW (%) red cell distribution width | - | 14.0±0.7 | 15.4±0.7 | - | 14.7±0.6 |
NT-proBNP (pg/mL) | - | 98±39 | 869±806 | - | 751±475 |
Uric acid (mg/dL) | - | 5.0±0.5 | 6.0±0.6 | - | 6.6±0.7 |
Creatinine (mg/dL) | - | 0.80±0.09 | 1.02±0.12 | - | 1.00±0.23 |
Bilirubin total (mg/dL) | - | 0.40±0.13 | 0.64±0.13 | - | 0.65±0.16 |
Identification of a characteristic lipidomics profile in pulmonary hypertension
The metabolome of patients was assessed with untargeted hydrophilic interaction liquid chromatography (HILIC)-HRMS from serum, EDTA, and heparin plasma samples in four measurement runs. As is typical with mass spectrometry (MS) based metabolomics methods, notable batch effects occurred between runs as well as drift within the longer runs (see Fig. S2). Drift correction was successfully performed (see Fig. S2) using quality control (QC) injections, which are a mix of equal sample volumes repeatedly measured to monitor instrument stability.
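The idea behind QC-based drift correction can be illustrated with a minimal sketch. The source does not state which algorithm was used, so the approach below (linear interpolation of the QC signal over injection order, then rescaling to the QC median) is an assumption chosen for illustration, and the function and argument names are hypothetical.

```python
from bisect import bisect_right
from statistics import median


def qc_drift_correct(order, intensity, is_qc):
    """Rescale one metabolite's intensities by the interpolated QC trend.

    order     -- injection order of each measurement (unique numbers)
    intensity -- measured signal for one metabolite
    is_qc     -- True where the injection is a pooled QC sample
    Assumption: drift is approximated by linear interpolation between
    neighbouring QC injections; this is not necessarily the authors' method.
    """
    qc = sorted((o, v) for o, v, q in zip(order, intensity, is_qc) if q)
    if len(qc) < 2:
        raise ValueError("need at least two QC injections")
    qc_x = [o for o, _ in qc]
    qc_y = [v for _, v in qc]
    reference = median(qc_y)  # target level after correction

    def trend(x):
        # Hold the first/last QC value outside the QC-covered range.
        if x <= qc_x[0]:
            return qc_y[0]
        if x >= qc_x[-1]:
            return qc_y[-1]
        j = bisect_right(qc_x, x)
        x0, x1, y0, y1 = qc_x[j - 1], qc_x[j], qc_y[j - 1], qc_y[j]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return [v * reference / trend(o) for o, v in zip(order, intensity)]


# Toy run with a purely linear upward drift (illustrative values only).
corrected = qc_drift_correct(
    order=[0, 1, 2, 3, 4],
    intensity=[100, 110, 120, 130, 140],
    is_qc=[True, False, True, False, True],
)
```

In this toy run the signal rises steadily with injection order, so after correction every injection sits at the QC median. Real runs would apply the same step per metabolite and per batch.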
In total, 164 known metabolites were of consistent analytical quality suitable for multivariate and univariate exploratory analysis. Global metabolic changes were first examined using the unsupervised multivariate independent principal component analysis (iPCA). The metabolomes of the PH patients differed from the control groups (HC and DC), which was visible as a clear group separation in the iPCA scores plot along the first component (x-axis, Fig. 2A). The observed metabolic difference was strongly driven by an increase in specific free fatty acids (FFA) in PH patients (Fig. 2B). The machine learning method orthogonal projections to latent structures discriminant analysis (OPLS-DA) confirmed that the observed global metabolic differences between PH and HC/DC were significant (p < 0.001, cross-validation and 1000 random permutations, Fig. 2C).
The univariate statistical analysis confirmed that FFAs were strongly and significantly increased in PH as compared with HC/DC (Fig. 2D). The metabolites from routine clinical chemistry, e.g. uric acid, were strongly correlated with their respective HILIC-HRMS metabolites (Fig. S3). Single FFAs and lipids were not strongly correlated with any clinical parameter, suggesting that the detected FFA changes may be independent from conventional clinical assessment.
FFA/lipid-ratios diagnose pulmonary hypertension
The potential of using our metabolites to predict PH was first investigated with the machine learning methods random forest (RF)17,18 and extreme gradient boosting (XGBoost)19. Of the 164 metabolites, 11 were excluded from biomarker analysis because of low signal intensities and high noise (see Supplementary Data 1). RF and XGBoost both achieved an area under the curve (AUC) of 0.82 in the receiver operating characteristic (ROC) analysis in the validation set (Fig. 3A). The drift correction used here allowed a joint exploratory statistical analysis, but drift correction is impossible in future routine clinical diagnostics. Therefore, the diagnostic and prognostic performance was also tested without drift correction. RF and XGBoost performance was almost identical without drift correction, indicating that their nonlinear algorithms exhibit intrinsic drift-handling capabilities (Fig. 3A).
Other important model performance parameters such as specificity, sensitivity, and balanced accuracy were comparable for RF and XGBoost irrespective of drift correction. For RF and XGBoost, the average balanced accuracy was 72.2% and 72.7%, respectively, the specificity was 88.6% and 91.2%, respectively, and the sensitivity was 55.8% and 54.1%, respectively (Fig. 3C). The validation cohort had only seven HC and no DC, with a slightly different age and BMI distribution than the training cohort. To overcome this limitation, we tested an artificial data split into 70% training and 30% validation sets with balanced distributions in age, BMI, sex and disease class (PH/DC/HC). The performance of RF and XGBoost remained similar to the original split by center (Fig. S4).
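As a reading aid, balanced accuracy is the mean of sensitivity and specificity, which is consistent with the values reported above. The following minimal sketch shows the arithmetic; the function names and the confusion-matrix counts in the usage note are hypothetical and not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of PH cases correctly flagged
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    return sensitivity, specificity


def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    return (sensitivity + specificity) / 2


# Reported RF averages: sensitivity 55.8%, specificity 88.6%.
rf_balanced = balanced_accuracy(0.558, 0.886)      # approximately 0.722
# Reported XGBoost averages: sensitivity 54.1%, specificity 91.2%.
xgb_balanced = balanced_accuracy(0.541, 0.912)     # approximately 0.7265
```

For example, `sensitivity_specificity(tp=8, fn=2, tn=9, fp=1)` gives a sensitivity of 0.8 and a specificity of 0.9, hence a balanced accuracy of 0.85. The low sensitivity of both models is what pulls their balanced accuracy down despite high specificity.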
Despite their diagnostic potential, neither machine learning approach is suitable for routine clinical diagnostics because both are labour intensive and not fully explainable. In addition, both approaches showed notable decreases in AUC, sensitivity, specificity and balanced accuracy from the training to the validation cohort. Therefore, we tested whether ratios of lipophilic metabolites that differed strongly between PH and HC/DC, normalized to metabolites that did not differ between PH and HC/DC, could replace machine learning approaches to create an explainable, easy-to-measure marker. In PH, many FFAs were strongly increased while many complex lipids were unchanged (Fig. 2D), offering the option to achieve markers of PH that are easy to measure. Thus, characteristic FFAs were selected for the numerator based on their analytical performance in PH versus HC/DC, and lipids with good analytical performance and no significant difference between PH and HC/DC were chosen for the denominator.
For the numerator, 11 FFAs were considered: FFA C15:0 (pentadecylic acid), FFA C16:2 (palmitolinoleic acid), FFA C16:1 (palmitoleic acid), FFA C17:1 (heptadecenoic acid), FFA C17:0 (margaric acid), FFA C18:3 (α-linolenic acid, ALA, or γ-linolenic acid, GLA), FFA C18:2 (linoleic acid, LA), FFA C18:1 (oleic acid), FFA C19:1 (nonadecenoic acid), FFA C20:5 (eicosapentaenoic acid, EPA) and FFA C20:1 (eicosenoic acid). Eight lipids, the best two from each of four common classes, were considered for the denominator: lysophosphatidylcholines (LPC 18:2, LPC 18:1), phosphatidylcholines (PC 36:4, PC 38:6), sphingomyelins (SM 34:2, SM 36:2), and lysophosphatidylethanolamines (LPE 16:0, LPE 18:1). To stabilize the ratios, we also tested sums of up to six FFAs in the numerator and up to four lipids in the denominator, limiting the maximum number of individual metabolites in a given ratio to ten.
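The size of this search space can be estimated with a short calculation. The sketch below assumes that every non-empty subset of up to six of the 11 FFAs was paired with every non-empty subset of up to four of the eight lipids. The source does not spell out the enumeration, so this is an illustration rather than the authors' procedure, and the function name is hypothetical.

```python
from math import comb


def count_candidate_ratios(n_ffa=11, n_lipid=8, max_num=6, max_den=4):
    """Count ratios of (sum of FFAs) / (sum of lipids).

    Assumption: each numerator is an unordered, non-empty set of at most
    `max_num` FFAs, each denominator an unordered, non-empty set of at
    most `max_den` lipids, and all pairings are evaluated.
    """
    numerators = sum(comb(n_ffa, k) for k in range(1, max_num + 1))
    denominators = sum(comb(n_lipid, k) for k in range(1, max_den + 1))
    return numerators * denominators


n_ratios = count_candidate_ratios()
print(n_ratios)  # 1485 numerators x 162 denominators = 240570
```

Under these assumptions the count is 240,570, which is on the order of a quarter of a million candidate ratios.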
In total, about a quarter of a million FFA/lipid-ratios were evaluated for their diagnostic performance in ROC analysis. Most FFA/lipid-ratios achieved moderate to high performance with AUCs above 0.8 (Fig. S5A-I,K,L) irrespective of drift correction or split. There may be a concern that combining multiple metabolites in a ratio increases technical error and impairs reproducibility. However, in our cohort, the calculated technical variability according to the laws of error propagation was <5% for most FFA/lipid-ratios and <10% for all others (Fig. S5J).
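The error-propagation argument can be made concrete with a first-order sketch. It assumes that the measurement errors of the individual metabolites are uncorrelated and are given as coefficients of variation. The function name and the example values are hypothetical, and the source does not state the exact formula that was applied.

```python
from math import sqrt


def ratio_relative_error(num_values, num_cvs, den_values, den_cvs):
    """First-order relative error of sum(num_values) / sum(den_values).

    num_values, den_values -- metabolite intensities in each sum
    num_cvs, den_cvs       -- their relative technical errors (e.g. 0.03)
    Assumption: errors are independent, so absolute errors add in
    quadrature within each sum and relative errors add in quadrature
    for the quotient.
    """
    sigma_num = sqrt(sum((v * cv) ** 2 for v, cv in zip(num_values, num_cvs)))
    sigma_den = sqrt(sum((v * cv) ** 2 for v, cv in zip(den_values, den_cvs)))
    rel_num = sigma_num / sum(num_values)
    rel_den = sigma_den / sum(den_values)
    return sqrt(rel_num ** 2 + rel_den ** 2)


# One metabolite per side with 3% and 4% error gives a 5% ratio error.
single = ratio_relative_error([1.0], [0.03], [1.0], [0.04])

# Summing several equally sized metabolites lowers the relative error of
# each sum, because independent errors partly cancel.
summed = ratio_relative_error([1.0] * 3, [0.03] * 3, [1.0] * 3, [0.04] * 3)
```

Under the independence assumption, summing several metabolites on each side reduces rather than inflates the relative error of the ratio, which is one way to read the stabilizing effect of sums described above.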
For simplicity, the top three ratios were selected based on their AUC, sensitivity, and specificity, and named RATIO1, RATIO2, and RATIO3 (Fig. S5M). These ratios had six metabolites in common: numerator (FFA C20:1 + FFA C16:1 + FFA C15:0); denominator (PC 38:4 + PC 36:4 + LPC 18:1). RATIO2 and RATIO3 differed from RATIO1 only by the addition of two more FFAs to the numerator sum (FFA C18:3 + FFA C17:0 and FFA C18:2 + FFA C17:0, respectively; Fig. 3B). The ROC curves of RATIO1–3 overlapped well between training and validation cohort, irrespective of drift correction (Fig. 3B). All other important performance parameters were also very similar between RATIO1–3 (Fig. 3C). As a general result, the top ratios outperformed the machine learning models, irrespective of drift correction or data split, especially in terms of sensitivity. The average balanced accuracy of the top three ratios was 85.5%, with 89.7% specificity and 81.4% sensitivity (Fig. 3C).
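To illustrate how such a ratio and its AUC could be computed from measured intensities, the following sketch uses the six metabolites named for RATIO1 above. The dictionary layout, function names, and toy numbers are assumptions for illustration only and are not study data. The AUC is computed through its rank interpretation (the probability that a randomly chosen PH sample scores higher than a randomly chosen control).

```python
# Metabolites of RATIO1 as listed in the text.
RATIO1_NUMERATOR = ("FFA C20:1", "FFA C16:1", "FFA C15:0")
RATIO1_DENOMINATOR = ("PC 38:4", "PC 36:4", "LPC 18:1")


def ffa_lipid_ratio(sample, numerator=RATIO1_NUMERATOR,
                    denominator=RATIO1_DENOMINATOR):
    """Sum of FFA intensities divided by sum of lipid intensities."""
    return (sum(sample[m] for m in numerator)
            / sum(sample[m] for m in denominator))


def roc_auc(scores_ph, scores_control):
    """AUC as the fraction of (PH, control) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_ph:
        for c in scores_control:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(scores_ph) * len(scores_control))


# Toy sample (hypothetical intensities in arbitrary units).
toy = {"FFA C20:1": 2.0, "FFA C16:1": 3.0, "FFA C15:0": 1.0,
       "PC 38:4": 4.0, "PC 36:4": 5.0, "LPC 18:1": 3.0}
toy_ratio = ffa_lipid_ratio(toy)               # (2+3+1) / (4+5+3) = 0.5
toy_auc = roc_auc([0.9, 0.7, 0.6], [0.5, 0.6, 0.2])
```

Because the ratio is a single number per sample, a diagnostic threshold can be read directly off the ROC curve, which is what makes the marker explainable compared with a trained model.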
FFA/lipid-ratios massively reduced technical complexity compared to broad metabolomics runs while achieving better diagnostic performance than machine learning approaches. The diagnostic performance of the FFA/lipid-ratios was independent of drift correction, making them suitable for stand-alone measurements. Accordingly, we assume that FFA/lipid-ratios are suitable for future routine applications.
Specific FFA/lipid-ratios predict survival
The prognostic value of RATIO1 was compared with the well-established prognostic PAH scores FPHR4p20 (based on WHO FC, 6MWD, RAP, and CI) and COMPERA2.01,21 (based on WHO FC, 6MWD, and NT-proBNP). Survival and hazard ratios (HR) were investigated for all our patients with heart or lung disease and PH. For direct comparability of Kaplan-Meier survival curves and HRs, all numeric values were categorized into low or high risk according to their optimal cut-off points (e.g. 57 years for age).
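The dichotomization and the Kaplan-Meier estimate described above can be sketched as follows. This is a generic product-limit estimator written for illustration; the source does not state which software was used, the function names are hypothetical, and the toy follow-up data are not study data.

```python
def risk_group(value, cutoff):
    """Label a numeric marker as high or low risk relative to a cut-off."""
    return "high" if value > cutoff else "low"


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time of each patient since baseline
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy example: four patients, the second one censored at t=2.
toy_curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
# Age is dichotomized at 57 years, the cut-off given in the text.
toy_groups = [risk_group(age, cutoff=57) for age in (45, 57, 63)]
```

Computing one such curve per risk group (low versus high) gives the pairs of curves that are compared in a Kaplan-Meier plot. Hazard ratios would additionally require a regression model such as Cox proportional hazards, which is not shown here.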
We analysed survival times since enrolment (= baseline) which were available for 129 PH and 21 DC (13 COPD, 8 ILD) patients. The COMPERA2.0 score was available for 122 PH and 11 DC patients, and FPHR4p for 97 patients (93 PH and 4 DC). Survival times and both scores were available for 91 PH patients. As expected, FPHR4p and COMPERA2.0 scores were significantly associated with survival time (Fig. 4). RATIO1 was also significantly associated with survival (Fig. 4) with a similar HR as COMPERA2.0 scores. When RATIO1 was combined with each of the scores (RATIO1 and score equally weighted), the prognostic value of the respective score improved notably (Fig. 4). This indicates that our simple metabolomic marker provided independent prognostic information and would complement established prognostic scores.
As expected, the established prognostic risk factors from the literature (higher WHO FC, lower 6MWD, and higher NT-proBNP) were associated with poorer survival (Fig. S7). Next, we examined the major potentially confounding factors age, sex and BMI. Age > 57 years constituted a considerable risk factor, while BMI > 26.8 kg/m2 and male sex were not significant (Fig. S6). Results were similar in the joint model for all three factors, suggesting that only age was a relevant confounder. Therefore, we included only age as a covariate in our HR analysis (Fig. 4A).
Overall, the HR results with age as covariate were similar to those without (Fig. 4A) and although age correction caused a decrease of all HR values, their respective prognostic impact remained significant.
Changes in lipid metabolism in pulmonary arteries of idiopathic PAH (IPAH) patients
We investigated small PA from explanted lungs of patients with IPAH, the prototype of PAH, and from healthy donor lungs. Oil red O staining and co-staining with markers of endothelial and smooth muscle cells showed accumulation of lipids in several IPAH PA (Fig. 5A). The observed lipid deposition could be the result of increased fatty acid uptake, metabolic dysregulation, or increased lipid synthesis. Therefore, we performed laser-capture microdissection of small PA (< 500 μm) from IPAH patients and healthy donors and examined gene expression of transporters and enzymes related to lipid metabolism. Gene expression showed significant upregulation of several key genes involved in lipid uptake and metabolism in IPAH (Fig. 5B). Most striking was the significant upregulation of SLC27A5, GPAT1, AGPAT1, Lipin2 and DGAT1. DGAT1 plays a critical role in lipid droplet formation. Upregulation of the FFA transport protein SLC27A5 indicates increased uptake of FFAs from the circulation. GPAT, AGPAT, and lipin family enzymes promote triglyceride biosynthesis, incorporation of exogenous fatty acids into triacylglycerides (TAG) and phospholipids, as well as β-oxidation. The upregulation of these genes in the small PAs of IPAH patients might be caused by the increased circulating FFA levels or might be a manifestation of an underlying disease mechanism.
Next, we mimicked elevated circulating FFA levels by treating primary human pulmonary artery smooth muscle cells (hPASMC) and human pulmonary artery endothelial cells (hPAEC) from healthy donors with an FFA cocktail. BODIPY fluorescence staining showed accumulation of fat in hPASMCs and hPAECs (Fig. 6A). To better understand the functional effects of this accumulation, we performed in vitro studies with primary hPAECs and hPASMCs. In hPASMCs, the treatment with the FFA mixture significantly promoted cell proliferation (Fig. 6B), and in hPAECs it significantly decreased acetylcholine (ACh)-induced NO secretion, suggesting endothelial dysfunction (Fig. 6C).
We investigated the effect of the FFA mixture on endothelial barrier function by determining the magnitude of thrombin-induced endothelial barrier dysfunction. Fig. 6D shows the typical response of control endothelium to thrombin. When endothelial monolayers were pre-treated with FFA, they exhibited a significantly more pronounced decrease in transendothelial electrical resistance (TEER) and delayed recovery of barrier function compared to control media (Fig. 6D). This suggests that FFA treatment causes profound endothelial dysfunction. Finally, we examined FFA-induced metabolic responses in hPASMCs and hPAECs by means of the Seahorse method and found significantly decreased coupling efficiency in both cell types in response to FFA treatment (Fig. 6E, F). Moreover, FFA exposure decreased non-mitochondrial respiration and ATP production in hPAECs and increased proton leak in hPASMCs. This suggests that FFA treatment changes the phenotype of hPASMCs and hPAECs from healthy donors into an IPAH phenotype and that high levels of circulating FFAs may cause pulmonary vascular dysfunction, representing either a primary cause or a novel vicious circle in PH.
Discussion
The uptake and metabolism of long-chain fatty acids are critical for many physiological and cellular processes, and cellular accumulation may cause numerous pathological and functional changes. Previous investigations in PAH have shown severe metabolic changes of the right ventricle and elevated in vivo myocardial triglyceride content10,22,23. Our investigations add important information by showing that small PA of IPAH patients are affected by lipid accumulation. Interestingly, we found increased gene expression of transporters and enzymes involved in fatty acid uptake and triglyceride biosynthesis in smooth muscle cells of IPAH patients (Fig. 5C). This could be caused by the high FFA levels in the circulation; however, it could also represent a change in the cell physiology that strongly contributes to the development of PH.
For the first time, we have explored the effects of FFA exposure in primary hPAECs and hPASMCs. FFA exposure decreased NO secretion and impaired barrier function in hPAECs, and caused increased proliferation in hPASMCs. In addition, FFA exposure induced changes in non-mitochondrial respiration and coupling efficiency in both cell types. This suggests that impaired lipid handling in IPAH PAs might trigger the remodelling in PAH. This is in line with numerous studies indicating that the expression of GPAT1 / AGPAT1 / lipin-1 has important metabolic consequences24–27. Our data, taken in context with data from the literature, may suggest that in PH, the failing right ventricle is no longer able to cope with FFA metabolism, leading to an increase in circulating FFA levels, which negatively affects the pulmonary vessels. However, it is also possible that there are primary changes in the lipid metabolism of the small PAs, leading to vascular dysfunction and subsequently increasing right ventricular afterload, thereby initiating a vicious circle that finally causes PH and right heart failure.
Spanning over two decades, extensive basic, translational, and clinical analyses have supported a causative link between metabolic reprogramming and PAH28,29. Similar to the Warburg effect in cancer, a shift from mitochondrial oxidation to glycolysis appears to occur in the right ventricle of PAH patients22,23. In this study, we show that the small PAs are affected by significant metabolic changes. Taking cues from cancer, recent data demonstrate significant alterations in metabolic programs other than glycolysis and glucose oxidation, including the pentose phosphate pathway (PPP), glutaminolysis, lipolysis, fatty acid synthesis and oxidation and changes in the plasma proteome10–16,30–32. However, it remains unclear whether these changes originate in the overloaded right ventricle, in the primarily affected pulmonary vessels, or elsewhere.
Although there has been tremendous progress in the understanding of PAH in recent decades, there is still an unmet need for diagnostics and therapy. According to a recent literature review, the five-year survival rate for newly diagnosed patients has not significantly improved despite a multitude of new PAH medications8. This may be due to the fact that the metabolic mechanisms of the disease have not been addressed in detail. To our knowledge, this study is the first to apply unsupervised, broad metabolome analysis to PH patients of groups 1 to 4. The generated specific FFA/lipid-ratios identified patients with PH, independent of comorbidities. The same ratios provided prognostic information, complementing existing clinical prognostic scores. The diagnostic and prognostic results were validated in an independent international cohort and confirmed by a balanced split group approach. Of note, our in vitro mechanistic studies suggest that disturbed lipid metabolism may significantly contribute to the pathologic mechanisms in IPAH patients. Our simple FFA/lipid-ratios might be useful in the diagnosis and clinical management of PH patients and might even serve as surrogate endpoints in future clinical trials.
Classical machine learning using RF and XGBoost showed that metabolic differences hold a high potential for diagnostic biomarkers. Both approaches were able to overcome typical technical MS-specific problems such as batch effects and intensity jumps between measurements, which usually require drift correction, suggesting that such labour-intensive procedures are not necessary if PH is to be detected. The same is true for our easily applicable FFA/lipid-ratios, which performed comparably well with drift-corrected and non-corrected metabolomics data, suggesting that forming such a ratio corrects for batch effects and drifts in the metabolomics dataset as well as for inter-individual variability of patients’ lipid metabolism and lifestyle.
Our FFA/lipid-ratios performed very well in both PH diagnosis and survival prediction. Their diagnostic performance even outperformed RF and XGBoost models, especially in terms of sensitivity. In addition, the results were stable despite the use of different sample types such as serum, heparin and EDTA plasma from three different centers, a very important prerequisite for broad applicability in routine diagnostics. The performance was also stable when training and validation cohort were not split by center but 70% to 30% balanced by age, BMI, sex and class. The FFA/lipid-ratios are easy to measure and fully explainable compared to machine learning models, which is advantageous for future studies and regulatory approval processes for in vitro diagnostics (IVD).
Survival prediction is an important tool in the management of PAH patients. Several clinical scores have been established, with FPHR4p and COMPERA2.0 representing the most recent developments derived from large databases20,21. The FPHR4p20 score estimates prognosis based on WHO FC, 6MWD, CI, and RAP, while COMPERA2.021 is based on non-invasive parameters only (WHO FC, 6MWD, NT-proBNP). RATIO1 showed an age-dependency that was comparable to both clinical scores. Most importantly, both clinical scores gained notably in predictive power when combined with RATIO1. This suggests that FFA/lipid-ratios represent an independent, non-invasive prognostic factor that combines favourably with established prognostic PH scores.
Strengths and limitations
Strengths of this study include the exploration of primary IPAH small PA vessels with mechanistic insight into the effects of FFA on PASMC and PAEC, a broad metabolomics approach, sampling and processing conditions suitable for routine clinical practice, inclusion of a disease control group, comprehensive clinical assessment, use of machine learning, and development of diagnostic and predictive FFA/lipid-ratios. As a limitation, we had access to only a small number of patients and controls. This may have been compensated for by the thorough clinical characterization of the patients, including RHC, coupled with long follow-up times for survival analysis and the 70% to 30% balanced split test, which confirmed the results. Another limitation is that FPHR4p and COMPERA2.0 were derived from PAH patients, whereas we applied them to all available patients, including those with PH associated with left heart and lung diseases. This may have introduced a bias into the prognostic performance of these scores; however, this bias affects the scores and our new markers in the same way. It may be seen as a limitation that blood samples were collected along with routine clinical blood draws, without standardized fasting or other control measures; however, this may also represent a strength of our study, as it suggests robustness of the results. All metabolic measurements were based on high-resolution mass spectrometry, a very sensitive and exact method that is nevertheless slow, expensive and work-intensive. However, our FFA/lipid-ratios allow for a simplified approach that is easily available.
Outlook
Future studies with larger numbers of patients, a balanced group distribution, and more patients with early pulmonary hypertension and less impaired right ventricular function, as well as longitudinal studies, are warranted to investigate the value of metabolic markers for patient management.
Conclusions
Based on our mechanistic insights into the metabolic changes in small PA, and our machine learning approaches, our FFA/lipid-ratios identified PH patients with a high accuracy and were significantly associated with prognosis. This may point to novel diagnostic tools and possibly also to new therapeutic targets. If implemented into the management strategy for PAH patients, this might inform therapy decisions to improve outcomes of PAH therapy.
Online Methods
Cohort Data Sources
Results from the Graz Pulmonary Hypertension Registry (GRAPH) have been reported previously1,2. Briefly, the program uses a software application linked to the electronic health record for documentation of all patients of the Division of Pulmonology at the Department of Internal Medicine of the Medical University of Graz who gave written informed consent. Demographic, clinical, echocardiographic, procedural, and hemodynamic data and blood samples are collected and tracked for longitudinal outcomes. All hemodynamics were measured in a standardized fashion by the same experienced team1. Regularly scheduled quality checks of the registry are performed to ensure completeness and accuracy. Written informed consent was obtained from all patients and the study was conducted in line with the Declaration of Helsinki. The study was approved by the institutional ethics board (identifier: 23-408ex10/11), and the study has been registered at ClinicalTrials.gov (NCT01607502).
The BioPersMed (Biomarkers of Personalised Medicine) project is designed as a single-centre, prospective, observational cardiovascular risk study. Between 2010 and 2016, 1022 community-dwelling and asymptomatic individuals were regionally recruited and assessed biannually, including a standardized biospecimen acquisition3. Written informed consent was obtained from all participants and the study was conducted in line with the Declaration of Helsinki.
Cohort Study Population
We retrospectively enrolled two consecutive cohorts of patients identified through our single-center GRAPH registry and labelled them as GRAPH-Metabolomics (GRAPH-M).
The inclusion criteria for the first cohort were diagnosis of idiopathic pulmonary arterial hypertension (IPAH) and absence of severe co-morbidities. Healthy sex- and age-matched subjects served as healthy controls (HC).
Inclusion criteria for the second cohort comprised PH groups 1 to 4: 1) PAH, 2) PH associated with heart disease, 3) PH associated with lung diseases (COPD, ILD), or 4) CTEPH. The cohort of disease controls (DC) consisted of patients with airway or parenchymal lung disease or patients with metabolic syndrome (hypertension, hypercholesterolemia and type 2 diabetes mellitus) but no signs of elevated PAP. Healthy sex- and age-matched subjects served as a control group (HC). The patients with metabolic syndrome were selected from the BioPersMed cohort Graz.
All cohorts with the exception of patients with metabolic syndrome or HC underwent RHC. In patients who underwent multiple RHCs, the first RHC was considered the index procedure and was the only one included in the analysis. Patients were included in the analyses if data from a complete RHC were available, including resting values for mPAP, PAWP, CO, heart rate, systolic and diastolic PA pressure, PVR and mixed venous oxygen saturation (SvO2). Pulmonary vascular resistance was calculated with the standard equation PVR = (mPAP - PAWP)/CO and expressed in Wood units (WU).
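The PVR equation above translates directly into code. The following sketch is illustrative only; the function name, the argument names, and the example values are hypothetical.

```python
def pulmonary_vascular_resistance(mpap_mmhg, pawp_mmhg, co_l_per_min):
    """PVR in Wood units: (mPAP - PAWP) / CO.

    mpap_mmhg    -- mean pulmonary arterial pressure (mmHg)
    pawp_mmhg    -- pulmonary arterial wedge pressure (mmHg)
    co_l_per_min -- cardiac output (L/min)
    """
    if co_l_per_min <= 0:
        raise ValueError("cardiac output must be positive")
    return (mpap_mmhg - pawp_mmhg) / co_l_per_min


# Example with round numbers: (42 - 10) mmHg / 4 L/min = 8 WU.
example_pvr = pulmonary_vascular_resistance(42, 10, 4)
```

Because the pressure difference is in mmHg and cardiac output in L/min, the result is in mmHg·min/L, which is the definition of the Wood unit.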
Validation Cohort
This cohort comprised an international multicentric patient cohort. The inclusion criteria for the validation cohort were the confirmed diagnosis of PAH and informed written consent at the home institution. All participating centers were experienced centers of excellence for PH and all patients underwent RHC in a standardized manner. Demographic, clinical, echocardiographic, procedural, and hemodynamic data and blood samples were available as anonymized data. The sample collection was approved by the local Ethics Committee in each local center (Regensburg University, ethics committee No. 08/090 and cantonal ethical review board Zürich KEK 2010–0129; 2014–0214; 2017–0476).
Cohort Outcomes
The primary outcome was the confirmation of PH, defined as mPAP ≥ 25 mmHg according to the ERS/ESC guidelines from 2015 (ref. 4). The secondary endpoints included time to all-cause mortality with data provided by Statistics Austria (single-centre registry Graz) and by the respective centers that contributed to the validation cohort. A complete list of covariates analyzed in this study is provided in the Supplementary Material (Fig. S3).
Human lung tissue samples
Human lung tissue samples were obtained from patients with IPAH who underwent lung transplantation at the Department of Surgery, Division of Thoracic Surgery, Medical University of Vienna, Vienna, Austria. The protocol and tissue usage were approved by the institutional ethics committee (976/2010) and written patient consent was obtained before lung transplantation. The patient characteristics included: age at the time of the transplantation, weight, height, sex, mPAP measured by RHC, pulmonary function tests, as well as the medical therapy. The chest computed tomography scans and RHC data were reviewed by experienced pathologists and pulmonologists to verify the diagnosis. Healthy donor lung tissue was obtained from the same source. Donor/IPAH patient characteristics are given in Supplementary Material (Table S2).
Lung histological Oil Red O staining
Lipid accumulation in lung tissue was visualized by Oil Red O staining (Merck KGaA, Darmstadt, Germany). Lung tissue was embedded in tissue freezing medium, snap frozen at −80 °C and sliced using a Leica CM 1900 cryostat (Leica Biosystems GmbH, Wetzlar, Germany) at a thickness of 5 μm per section. The slices were stained with Oil Red O working solution for 10 min, differentiated in isopropanol for 5 min, and then washed with water at room temperature (RT, 20 – 25 °C). All steps were carried out according to the manufacturer's instructions. The morphological features of the tissues were assessed by hematoxylin-eosin (H&E) staining. Lipids in the tissue were observed by microscopy.
Laser capture microdissection of PA and RNA extraction
Laser capture microdissection (LCM) of 10 donor lungs and 10 lungs from IPAH patients, as well as mRNA isolation and cDNA synthesis, were performed as previously described5. The intima and media layers of PAs of 100 – 500 μm diameter were selected, marked and isolated with the Arcturus® LCM System. Captured vessels were immediately transferred into RNA lysis buffer and snap frozen. RNA was isolated with the RNeasy Micro Kit (Qiagen, Hilden, Germany)6.
qRT-PCR - laser capture microdissected human PA
The expression of enzymes and transporters was analyzed with real-time quantitative (qRT)-PCR using the QuantiFast SYBR PCR reagent (Qiagen, Hilden, Germany) according to Papp et al. 2019 (ref. 7). Primer pairs (Eurofins, Graz, Austria), summarized in Supplementary Material (Table S3), were designed to span at least one exon-exon boundary to avoid the amplification of genomic DNA. The specificity of all primers, as well as the length of the amplicon, were confirmed by melting curve analysis and by running the products on 2% agarose gels, respectively.
Cell Isolation and culture
hPAECs
hPAECs were either purchased from Lonza or isolated from donor lungs. For the isolation of donor hPAECs, PA (< 2 mm in diameter) were isolated and the endothelium was incubated with an enzymatic mixture of collagenase, DNase and dispase in HBSS at RT8. The cell suspension was collected, resuspended in VascuLife Complete SMC Medium and cultured in gelatin-coated T25 flasks at 37°C and 5% CO2. After reaching 70 – 80% confluency, cells were trypsinized, enriched by 3 consecutive steps of CD31-selective magnetic-activated cell sorting and verified by morphology and marker expression (smooth muscle actin SMA, fibronectin, vimentin, von-Willebrand Factor VWF, smooth muscle myosin heavy chain and CD31). Surplus hPAECs were frozen (endothelial cell complete medium containing 12% FCS and 10% DMSO) and stored in liquid nitrogen until further use. Passages 2–6 were used for the experiments. Detailed patient characteristics of isolated hPAECs can be found in Supplementary Material (Table S4).
hPASMCs
The isolation and culture of hPASMCs were performed as previously reported9. After the removal of the endothelial cell layer, the media was peeled away from the underlying adventitial layer and cut into approximately 1–2 mm² sections, centrifuged and resuspended in VascuLife Complete SMC Medium supplemented with 20% FCS and 0.2% antibiotics, then transferred to T75 flasks and cultured at 37°C and 5% CO2. Once the hPASMCs had reached confluency, the cells were trypsinized and either cultured in VascuLife Complete SMC Medium supplemented with 10% FCS and 0.2% antibiotics, or frozen (VascuLife Complete SMC Medium containing 15% FCS and 10% DMSO) and stored in liquid nitrogen until further use. Passages 4–8 were used for the experiments. SMCs were verified by morphology and marker expression (smooth muscle actin SMA, fibronectin, vimentin, von-Willebrand Factor VWF, smooth muscle myosin heavy chain and CD31). Detailed patient characteristics of isolated hPASMCs can be found in Supplementary Material (Table S4).
Lipid visualization by BODIPY staining
BODIPY (D3922, Invitrogen, Carlsbad, CA, USA; excitation wavelength 493 nm, emission maximum 503 nm) was diluted in phosphate-buffered saline (PBS, 137 mM NaCl, 2.7 mM KCl, 12 mM HPO₄²⁻/H₂PO₄⁻, pH 7.4) or DMSO at a concentration of 1 mg/mL and applied to the hPASMCs and hPAECs for 20 min at RT. Fixed cells (4% paraformaldehyde at 37°C for 5 min) were used. Following fixation, samples were washed 3 times in PBS for 10 min. Sections were counterstained with 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI; Sigma-Aldrich) to visualize nuclei and covered with glass cover slips. Images were taken using a laser scanning confocal microscope (Zeiss LSM 510 META; Zeiss, Jena, Germany) with a Plan-Neofluar (×40/1.3 Oil DIC) objective.
Measuring cell metabolic state
Oxygen consumption rate (OCR) and extracellular acidification rate (ECAR) were determined with the Seahorse XFp analyzer (Agilent, USA)10. hPASMCs or hPAECs were plated onto cell culture microplates on the day before the experiments, treated with 0.25 mM FFA (a mix of oleate, FFA 16:0 and FFA 18:0 (2:1:1)) in VascuLife® complete Medium and incubated for 24 h. Cells were then incubated in XF assay medium (Agilent), supplemented with 25 mmol/L glucose and 1 mmol/L pyruvate (hPASMCs) or 10 mmol/L glucose, 1 mmol/L pyruvate, and 2 mmol/L L-glutamine (hPAECs) for 1 h at RT before the measurement. After recording the basal rates of OCR and ECAR, oligomycin, carbonyl cyanide-4-(trifluoromethoxy)phenylhydrazone (FCCP), and rotenone plus antimycin A were added through the instrument's injection ports (XF Cell Mito Stress Test Kit, Agilent) to obtain proton leak, maximal respiratory capacity, and nonmitochondrial respiration, respectively. Final concentrations were 1 μmol/L oligomycin, 1 μmol/L FCCP, and 0.5 μmol/L rotenone and antimycin A for hPASMCs, and 1 μmol/L oligomycin, 1 μmol/L FCCP, and 1 μmol/L rotenone and antimycin A for hPAECs. Glycolytic capacity was measured using an XF Glycolysis Stress Test Kit (Seahorse Bioscience). ECAR was determined after serial injection of 10 mmol/L D-glucose, 1 μmol/L oligomycin, and 100 mmol/L 2-deoxyglucose. All assays were performed in triplicate and normalized to protein content.
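The text names the respiration parameters obtained from the injections but does not give the arithmetic. The Python sketch below follows a commonly used convention for mitochondrial stress test read-outs (nonmitochondrial respiration as the minimum rate after rotenone/antimycin A, all other parameters corrected for it); whether exactly these definitions were applied here is an assumption, and the function name and OCR readings are hypothetical.

```python
def mito_stress_parameters(basal, after_oligomycin, after_fccp, after_rot_aa):
    """Derive respiration parameters from OCR readings (pmol/min) of each assay phase.

    Each argument is a list of consecutive OCR readings recorded in one phase.
    The definitions follow a common convention and are an assumption, not the
    documented procedure of this study.
    """
    non_mito = min(after_rot_aa)                 # remaining OCR after rotenone/antimycin A
    basal_resp = basal[-1] - non_mito            # last reading before the first injection
    proton_leak = min(after_oligomycin) - non_mito
    maximal = max(after_fccp) - non_mito
    return {
        "non_mitochondrial": non_mito,
        "basal": basal_resp,
        "proton_leak": proton_leak,
        "atp_linked": basal_resp - proton_leak,
        "maximal": maximal,
        "spare_capacity": maximal - basal_resp,
    }


# Hypothetical readings for one well
params = mito_stress_parameters(
    basal=[100, 98, 96],
    after_oligomycin=[40, 38, 36],
    after_fccp=[150, 160, 155],
    after_rot_aa=[20, 18, 16],
)
print(params)
```

In practice each value would additionally be normalized to the protein content of the well, as stated above.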
Endothelial Barrier Function
TEER served as an indicator of the barrier function of endothelial cell monolayers and was determined using an electrical cell-substrate impedance sensor (ECIS) (Applied Biophysics, Troy, NY, USA). Briefly, the endothelial cells were seeded in complete medium into each well of 8W10E-PET arrays (Applied Biophysics, NY, USA) and allowed to grow until they reached confluence. FFA (0.25 mM, a combination of FFA 18:1, FFA 16:0 and FFA 18:0 (2:1:1)) was applied for 24 hours before barrier disruption was initiated by the addition of recombinant human thrombin.
Proliferation
To investigate the proliferative effect of FFA treatment on hPASMCs, the following protocol was applied11: 10 000 hPASMCs were seeded in 96-well plates; the following day the cells were starved (VascuLife® Basal Medium, 0% FCS, 0.2% antibiotic/antimycotic) or kept under control conditions (VascuLife® Basal Medium with 5% FCS; LifeLine Technology, Walkersville) for 12 h. Afterwards, platelet-derived growth factor (PDGF)-BB was added and the proliferation of hPASMCs was determined by [3H] thymidine (BIOTREND Chemikalien GmbH) incorporation, after 24 h of incubation, as an index of DNA synthesis and measured as radioactivity by scintillation counting (Wallac 1450 MicroBeta TriLux Liquid Scintillation Counter and Luminometer). To investigate the effect of FFA (0.25 mM, a combination of oleate, FFA 16:0 and FFA 18:0 (2:1:1)) on hPASMCs, the same number of cells was seeded and after 12 h of starvation, FFA and vehicle were added and the proliferation of hPASMCs was determined as aforementioned. All experiments were performed in quadruplicate.
DAF-FM-mediated nitric oxide measurement
Measurements were performed as previously described8. hPAECs were seeded in gelatin-coated dark 96-well plates, starved for 2 h with Ringer’s solution and loaded with 10 μM 4-amino-5-methylamino-2′,7′-difluorofluorescein diacetate (DAF-FM) for 30 min at 37°C. The cells were stimulated with 5 μM acetylcholine (ACh) to induce nitric oxide production, and fluorescence was measured on a CLARIOstar Plus (BMG Labtech, Ortenberg, Germany) at Ex/Em = 495/515 nm. All assays were performed in quadruplicate and normalized to protein content.
Plasma liquid chromatography–mass spectrometry metabolomics
Metabolites were analysed by targeted hydrophilic interaction liquid chromatography–high resolution mass spectrometry (HILIC-HRMS) metabolomics according to Bajad et al.12 and samples were processed according to Yuan et al.13 as described previously14,15.
Samples from Graz were aliquoted and stored at the Biobank Graz. On the day of processing they were thawed in an ice-water bath on a slow rotary shaker (300 rpm) within < 10 min and vortexed briefly. Aliquots of 100 μl were precipitated in LoBind Eppendorf tubes with 400 μl cold methanol (−80°C for at least 4 h, kept on dry ice) and vortexed briefly. After overnight precipitation at –80 °C the samples were centrifuged for 10 min at 14,000 g at 4°C and the supernatants were transferred to fresh LoBind Eppendorf tubes. Supernatants were dried under nitrogen flow and stored at –80°C until all batches of the cohort were finished. Extracts were reconstituted in 100 μl 30% methanol/H2O, vortexed for 45 s and centrifuged for 5 min at 14,000 g at 4°C. The supernatant was transferred into autosampler vials, and equal aliquots from all samples were pooled for quality control (QC). All ready-to-measure extracts were refrozen at –80°C prior to measurement. Every 24 h, samples were freshly thawed at RT, vortexed, spun down and added to the autosampler at 4°C.
Measurements were made in independent runs per cohort with samples in randomized order, interspersed with corresponding blanks, pooled QC samples and UltimateMix (UM, described previously16). Pooled QC samples were generated independently for cohorts 1 and 2, while QC was mixed for cohorts 3 and 4. Cohorts 1 and 2 were each extracted and measured in 1 batch, while cohorts 3 and 4 were randomly divided into 6 batches for sample extraction and measured with daily thawed extracts to reduce metabolite degradation.
Extracts were measured with a Dionex Ultimate 3000 high-performance liquid chromatography (HPLC) setup (Thermo Fisher Scientific, USA) equipped with a NH2-Luna HILIC analytical column and crudcatcher with an injection volume of 10 μl and a 37 min gradient from aqueous acetonitrile solution [(5% acetonitrile v/v), 20 mM ammonium acetate, 20 mM ammonium hydroxide, pH 9.45] as eluent A (LMA) to acetonitrile as eluent B (LMB). Mass spectrometric detection was carried out with a Q-Exactive™ system (Thermo Fisher Scientific). Electrospray ionization (ESI) was used for negative and positive ionization and masses between 70 and 1050 m/z were detected.
Raw data were converted to mzXML using msConvert (ProteoWizard Toolkit v3.0.5), and target metabolites were extracted using the in-house developed tool PeakScout. Spectrum slices were presented around the exact target mass (± 50 ppm) and retention time (± 3 min) in accordance with the standards described by Sumner et al.17. For each target metabolite all peak area integrations were manually confirmed in each sample. Molecular masses of target metabolites were taken from literature and available online databases (HMDB, KEGG, Metlin)18–20. In addition, pure substances of all hydrophilic metabolites and selected lipophilic metabolites were run on the same system to obtain accurate reference retention times and fragmentation spectra.
The analytical quality of all targeted metabolites was strictly graded to be suitable for multivariate analysis and univariate analysis using the following parameters: deviation from target mass < 5 ppm, mass difference range < 10 ppm, retention time standard deviation < 0.75 min, percentage of missing values < 30%, relative standard deviation in QC after drift correction <30%, and blank load in QC < 30%. Of 164 included metabolites, 11 metabolites were considered unsuitable for ROC analysis due to lower signal intensity and lower consistency in repeated sample measurements (controls in cohorts 3 and 4).
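The grading thresholds listed above can be expressed as a simple filter. The following Python sketch assumes that each metabolite has already been summarized into the six quality parameters; the class, its field names and the example values are hypothetical. The helper `ppm_deviation` shows how a mass deviation in ppm is obtained from a measured and a target m/z.

```python
from dataclasses import dataclass


def ppm_deviation(measured_mz: float, target_mz: float) -> float:
    """Relative mass deviation in parts per million."""
    return (measured_mz - target_mz) / target_mz * 1e6


@dataclass
class MetaboliteQuality:
    name: str
    mass_deviation_ppm: float   # mean deviation from target mass
    mass_range_ppm: float       # range of mass differences across samples
    rt_sd_min: float            # retention time standard deviation
    missing_pct: float          # percentage of missing values
    qc_rsd_pct: float           # RSD in QC after drift correction
    blank_load_pct: float       # blank load in QC


def passes_quality(m: MetaboliteQuality) -> bool:
    """Apply the thresholds stated in the text; all criteria must be met."""
    return (
        abs(m.mass_deviation_ppm) < 5
        and m.mass_range_ppm < 10
        and m.rt_sd_min < 0.75
        and m.missing_pct < 30
        and m.qc_rsd_pct < 30
        and m.blank_load_pct < 30
    )


# Hypothetical metabolites for illustration only
good = MetaboliteQuality("metabolite A", 1.2, 4.0, 0.10, 0.0, 8.5, 2.0)
bad = MetaboliteQuality("metabolite B", 1.0, 4.0, 0.10, 35.0, 8.5, 2.0)
print([m.name for m in (good, bad) if passes_quality(m)])  # ['metabolite A']
```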
Analytical quality of samples, blanks, QC, and UM was graded by sample median, peak shapes, retention time shifts, percentage of missing values < 30%, and position in the PCA scores plot. From cohort 2, the first two UM and from cohort 4 the last four QC did not meet the quality criteria and were therefore excluded.
Statistical analysis
Data visualisation and statistical analysis were performed with R v4.0.2 (R Core Team, 2020) (using the packages readxl, openxlsx, stringr, dplyr, tidyr, doParallel, statTarget, car, colorspace, RColorBrewer, ggplot2, ggforce, ggpmisc, ggpubr, scales, grid, ellipse, correlation, dendsort, pheatmap, nlme, emmeans, missMDA, FactoMineR, mixOmics, MetaboAnalystR 3.0.3, survival, survminer, pROC, caret, patchwork) and TIBCO Spotfire v12.5.0 (TIBCO, Palo Alto, CA). GraphPad Prism v9 was used to assess differences in the in vitro experiments.
Typically, MS results are relative and only comparable within the same run. However, recent advances in drift correction allow data from several runs to be merged for joint analysis. Peak areas without drift correction were log10-transformed prior to all further analysis. The drift correction was based on a random forest (RF)-driven algorithm that used QC measurements to model batch effects and drift for each metabolite with statTarget::shiftCor(., Frule = 0.7, ntree = 500, impute = "KNN", coCV = 100, QCspan = 0, degree = 2)21. The imputed, drift-corrected data were multiplied by 10^3 to make the numbers more readable after log10-transformation. In the drift-corrected, log10-transformed data all imputed values were removed and data were trimmed by median absolute deviation (MAD) score22, assuming a normal distribution (multiplication with 1.4826). Strong single outliers were removed with a very conservative threshold of an absolute MAD score > 4 (165 single values in 65 metabolites). Data for all metabolites and samples are provided in Supplementary Data 1 with and without drift correction.
To ensure the validity of the drift correction and of subsequent results, each measurement run was first analysed independently with unsupervised multivariate (iPCA), supervised multivariate (OPLS-DA) and univariate analysis on log10-transformed data without drift correction (see Fig. S1). Next, drift-corrected, log10-transformed data from all runs were jointly analysed with the same methods. The drift correction successfully removed the significant difference between measurement runs and notably reduced the technical variability in all metabolites (Fig. S2). Additionally, no difference was observed based on center or sample material type (Fig. S2). Replicate measurements of samples in different runs were averaged on drift-corrected data to yield one metabolome per patient for subsequent analysis.
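The MAD-based trimming can be illustrated with a short Python sketch. It assumes the usual definition of the MAD score, i.e. the distance from the median divided by the MAD scaled with 1.4826, and replaces values with an absolute score above 4 by a missing value. The function names and the example values are hypothetical.

```python
from statistics import median

MAD_SCALE = 1.4826  # makes the MAD comparable to a standard deviation under normality


def mad_scores(values):
    """Return (value - median) / (1.4826 * MAD) for each value."""
    med = median(values)
    mad = MAD_SCALE * median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; scores are undefined")
    return [(v - med) / mad for v in values]


def trim_outliers(values, threshold=4.0):
    """Replace values whose absolute MAD score exceeds the threshold with None."""
    return [
        v if abs(score) <= threshold else None
        for v, score in zip(values, mad_scores(values))
    ]


# Hypothetical log10-transformed intensities of one metabolite
intensities = [5.0, 5.1, 4.9, 5.2, 4.8, 5.0, 9.0]
print(trim_outliers(intensities))  # the last value is replaced by None
```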
All reported p-values were adjusted for multiple testing according to Benjamini–Hochberg (BH), denoted as pBH (stats::p.adjust())23. Distribution and scedasticity were investigated with the Kolmogorov–Smirnov test (stats::ks.test()) and the Brown–Forsythe Levene-type test (car::leveneTest())24, respectively. After log10-transformation, data were mostly normally distributed, with 91% of all metabolites without drift correction and 99% of all metabolites with drift correction testing not significant (pBH > 0.05). Similarly, data were mostly homoscedastic, with 79% of all metabolites without drift correction and 80% with drift correction testing not significant (pBH > 0.05).
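For readers unfamiliar with the BH adjustment, the following Python sketch reproduces the step-up procedure: each p-value is multiplied by the number of tests divided by its rank, and monotonicity is enforced from the largest p-value downwards. It is an illustrative re-implementation, not the code used in the analysis, and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    # Walk from the largest to the smallest p-value
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four univariate tests
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))  # approximately [0.02, 0.04, 0.04, 0.02]
```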
For iPCA, missing values were imputed with missMDA::imputePCA(., ncp = 10)25 and the analysis was performed on data centred and scaled to unit variance (z-scaled) with mixOmics::ipca(., scale = TRUE, ncomp = 2, mode = "deflation")26.
For OPLS-DA, missing values were imputed with MetaboAnalystR::ImputeMissingVar(., method = "knn_var")27, data were centred and scaled to unit variance (z-scaled) with MetaboAnalystR::Normalization(…, "AutoNorm") and models were calculated with MetaboAnalystR::OPLSR.Anal(., reg = TRUE) with a standard 7-fold cross-validation for the factor disease. Model stability was additionally verified with 1000 random label permutations by MetaboAnalystR::OPLSDA.Permut(., num = 1000).
Pearson correlations were calculated for each metabolite (drift-corrected, log10-transformed data) against each numeric clinical parameter (untransformed) with correlation::correlation()28. Results were filtered to retain only metabolites and clinical parameters with at least one significant correlation (pBH < 0.05). Retained correlations were clustered by Lance-Williams dissimilarity update with complete linkage using stats::dist() and stats::hclust(). Dendrograms were sorted with dendsort::dendsort()29 at every merging point according to the average distance of subtrees and plotted alongside the corresponding heat maps of Pearson R with pheatmap::pheatmap()30.
For univariate analysis of significant changes within each metabolite for the factor disease (i.e. PH vs. HC/DC), generalized least squares models were fitted with nlme::gls()31,32 without confounders and with potential confounders (age, sex, BMI) by maximum likelihood. For analysis within each cohort log10-transformed data were used; for joint analysis over all cohorts drift-corrected, log10-transformed data were used, thus constituting a nonlinear approach. The three most common potential confounders (age, sex, BMI) were added stepwise in all possible combinations and model performances were compared within each metabolite by lower AIC (Akaike information criterion; relative estimate of information loss), higher log-likelihood (goodness of fit), significance in the log-likelihood ratio test comparing two models, quality of Q-Q plots, randomness of residuals and direct comparison of t-ratios. All models with any confounder combination showed a significant influence (p < 0.01) on only a few metabolites (13–39). The model with age + sex impacted most metabolites. However, a direct comparison of t-ratios revealed a very small impact of age or sex correction on results, and in keeping with model parsimony, models without confounders were reported throughout.
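Two of the model comparison criteria named above, AIC and the log-likelihood ratio test, reduce to short formulas. The Python sketch below shows them for two nested models that differ by one parameter; the log-likelihoods and parameter counts are hypothetical, and the p-value helper is valid only for one degree of freedom.

```python
from math import erfc, sqrt


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


def lr_statistic(loglik_reduced: float, loglik_full: float) -> float:
    """Likelihood ratio statistic for nested models fitted by maximum likelihood."""
    return 2 * (loglik_full - loglik_reduced)


def chi2_pvalue_df1(statistic: float) -> float:
    """Upper tail of the chi-square distribution with exactly 1 degree of freedom."""
    return erfc(sqrt(statistic / 2))


# Hypothetical fits for one metabolite: disease only vs. disease + age
loglik_disease, k_disease = -120.0, 3
loglik_disease_age, k_disease_age = -119.2, 4

print(aic(loglik_disease, k_disease))          # 246.0
print(aic(loglik_disease_age, k_disease_age))  # about 246.4
print(chi2_pvalue_df1(lr_statistic(loglik_disease, loglik_disease_age)))
```

In this hypothetical case the simpler model has the lower AIC and the likelihood ratio test is not significant, so the confounder would not be retained.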
FFA/lipid-ratios were calculated with FFAs in the numerator and lipids in the denominator. The numerators were all possible summed (unweighted) combinations of up to six FFAs from 11 FFAs (most significant in univariate analysis PH versus HC/DC and best analytical quality): FFA C15:0, FFA C16:2, FFA C16:1, FFA C17:1, FFA C17:0, FFA C18:3, FFA C18:2, FFA C18:1, FFA C19:1, FFA C20:5 and FFA C20:1. The denominators were all possible summed (unweighted) combinations of up to four lipids from eight lipids (no significant change in univariate analysis and best analytical quality): LPC 18:2, LPC 18:1, PC 36:4, PC 38:6, SM 34:2, SM 36:2, LPE 16:0, and LPE 18:1. Combining every summed FFA numerator with every summed lipid denominator yielded a total of 240 570 different FFA/lipid-ratios. None of the FFAs and lipids used had missing values. The technical variability (RSD in QC) was calculated for each FFA/lipid-ratio following the rules of error propagation from the RSD in QC of the single metabolites in drift-corrected, log10-transformed data. All FFA/lipid-ratios were calculated once without and once with drift correction (both log10-transformed). Diagnostic performance was tested by ROC analysis with pROC::roc(…, algorithm = 2)33 based on our training cohort. A separate test split, as used in the machine learning approaches, was not needed here because the ratios are calculated directly without any model training. Logistic regression was fitted to the training data with stats::glm(…, family = "binomial"), and performance was evaluated on the validation cohort using pROC::roc(). The optimal threshold was determined with pROC::coords(…, best.method = "closest.topleft") to obtain sensitivity and specificity.
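The number of candidate ratios follows directly from the combinatorics described above: 1485 numerators (1 to 6 of 11 FFAs) times 162 denominators (1 to 4 of 8 lipids). The Python sketch below enumerates the subsets and confirms the total. The helper `log_ratio` assumes that peak areas are summed on the linear scale and that the ratio is log10-transformed afterwards, which is one possible reading of the text; the example peak areas are hypothetical.

```python
from itertools import combinations
from math import comb, log10

FFAS = ["FFA C15:0", "FFA C16:2", "FFA C16:1", "FFA C17:1", "FFA C17:0", "FFA C18:3",
        "FFA C18:2", "FFA C18:1", "FFA C19:1", "FFA C20:5", "FFA C20:1"]
LIPIDS = ["LPC 18:2", "LPC 18:1", "PC 36:4", "PC 38:6",
          "SM 34:2", "SM 36:2", "LPE 16:0", "LPE 18:1"]


def subsets(items, max_size):
    """Yield every combination of 1..max_size items."""
    for size in range(1, max_size + 1):
        yield from combinations(items, size)


def count_ratios():
    """Number of numerator/denominator pairings."""
    n_numerators = sum(comb(len(FFAS), k) for k in range(1, 7))      # 1485
    n_denominators = sum(comb(len(LIPIDS), k) for k in range(1, 5))  # 162
    return n_numerators * n_denominators


def log_ratio(peak_areas, numerator, denominator):
    """log10 of summed numerator areas over summed denominator areas (assumed scale)."""
    num = sum(peak_areas[name] for name in numerator)
    den = sum(peak_areas[name] for name in denominator)
    return log10(num / den)


print(count_ratios())  # 240570

# Hypothetical peak areas for a single sample
areas = {"FFA C18:1": 200.0, "FFA C16:1": 800.0, "LPC 18:2": 100.0}
print(log_ratio(areas, ("FFA C18:1", "FFA C16:1"), ("LPC 18:2",)))  # 1.0
```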
Survival analysis used either time since sampling or time since diagnosis, defining confirmed death as the endpoint and censoring all other patients at the time of last known follow-up. The impact of all relevant clinical parameters (untransformed), all single metabolites (drift-corrected, log10-transformed data) and the best performing FFA/lipid-ratios (log10-transformed data without drift correction) was analysed. FPHR4p scores were inverted so that higher values represent higher mortality risk. Numerical parameters were split into high and low at the optimal cut-off for survival prediction, determined by maximally selected rank statistics (maxstat) with survminer::surv_cutpoint(…, minprop = 0.3)34. Kaplan–Meier curves were fitted for each category with survival::survfit()35, differences were tested with survminer::ggsurvplot() and plots with time since diagnosis were truncated at 15 years for better comparability with time since baseline (i.e. time since sampling). The Cox HR analysis36 was calculated with survival::coxph() for the confounders (age, sex, BMI), the COMPERA 2.0 score, FPHR4p, and RATIO1 alone or in combination. Numeric factors were categorized into high and low in the same way as for the Kaplan–Meier curves. RATIO1 was combined with the FPHR4p or COMPERA 2.0 score additively with equal weighting on values scaled from 0 to 1, with rescaling after addition.
Data visualisation and calculation of the machine learning and Cox HR analysis were performed with Python 3.9 (using the packages pandas, numpy, seaborn, sklearn, matplotlib, xgboost)37–40.
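The additive combination of the ratio with a clinical score can be sketched as follows. The Python example assumes min-max scaling for the "0 to 1" scaling mentioned in the text, which is a plausible but unconfirmed reading; function names and input values are hypothetical.

```python
def min_max_scale(values):
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot scale constant values")
    return [(v - low) / (high - low) for v in values]


def combine_additively(ratio_values, score_values):
    """Scale both inputs to 0..1, add them with equal weight, and rescale the sum."""
    scaled_ratio = min_max_scale(ratio_values)
    scaled_score = min_max_scale(score_values)
    summed = [r + s for r, s in zip(scaled_ratio, scaled_score)]
    return min_max_scale(summed)


# Hypothetical values for four patients
ratio = [0.0, 0.5, 1.0, 2.0]   # e.g. a log10-transformed FFA/lipid-ratio
score = [1, 2, 4, 3]           # e.g. a clinical risk score
print(combine_additively(ratio, score))  # approximately [0.0, 0.35, 0.9, 1.0]
```

The combined value could then be dichotomized into high and low in the same way as any other numeric parameter before the Cox analysis.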
For machine learning, the package sklearn41 was used for the random forest (RF) and the package xgboost for the XGBoost42 implementation; a fixed random seed was set for better reproducibility. Data were normalized to mean 0 and variance 1. A hyperparameter search over the number of trees {101, 301, 1001, 2001, 3001} and depth {5, 10, 100, 200, 300} for RF, and over eta {0.1, 0.01, 0.001}, depth {5, 10, 100, 200, 300} and n_estimators {101, 301, 1001, 2001, 3001} for XGBoost, was conducted with finally used hyperparameters highlighted in bold. Models were trained on training cohort data, which were further divided at random five times into 80% for training and 20% for testing (stratified by class, age, sex, with non-overlapping test data). Trained models were validated with the external validation cohort, which had no data overlap with the training cohort.
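Five 80/20 splits with non-overlapping test data correspond to a stratified five-fold partition. The following standard-library Python sketch illustrates the idea; it is not the published code (which is available in the repository named under Data and Code Availability). It assumes that each sample carries a categorical stratum label, so a continuous variable such as age would have to be binned first, and all names are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(strata, n_folds=5, seed=0):
    """Distribute sample indices over n_folds test sets, balancing each stratum."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_stratum = defaultdict(list)
    for index, label in enumerate(strata):
        by_stratum[label].append(index)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_stratum):
        members = by_stratum[label]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[(position + offset) % n_folds].append(index)
        offset += len(members)
    return folds


def train_test_splits(strata, n_folds=5, seed=0):
    """Yield (train_indices, test_indices); every sample is tested exactly once."""
    all_indices = set(range(len(strata)))
    for test in stratified_folds(strata, n_folds, seed):
        yield sorted(all_indices - set(test)), sorted(test)


# Hypothetical strata: (class, sex) per sample
strata = [("PH", "f")] * 40 + [("PH", "m")] * 30 + [("HC", "f")] * 20 + [("HC", "m")] * 10
for train, test in train_test_splits(strata):
    print(len(train), len(test))  # 80 20 for each of the five splits
```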
In addition to the original split by center (i.e. city of sample origin), all samples were artificially split into 70% training and 30% validation sets with balanced distributions of age, BMI, sex and class (PH/DC/HC) to overcome the potential bias from the unequal distribution of age, BMI, sex and class in the original training and validation cohorts by center. The distributions of age and BMI did not differ according to a t-test, nor did those of sex and class (PH/HC/DC) according to a χ² test (p > 0.2). All machine learning and FFA/lipid-ratio ROC analyses were repeated for these 70:30 training and validation sets.
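The ROC step applied to the FFA/lipid-ratios can be illustrated with a small Python sketch that computes the AUC as the probability that a PH sample scores higher than a control, and selects the threshold closest to the top-left corner of the ROC plot. It is a simplified stand-in for the pROC functions named in the text: candidate thresholds are the observed scores rather than midpoints between them, and the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity); positive means score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        points.append((threshold, tp / n_pos, tn / n_neg))
    return points


def closest_topleft(points):
    """Pick the point minimizing (1 - sensitivity)^2 + (1 - specificity)^2."""
    return min(points, key=lambda p: (1 - p[1]) ** 2 + (1 - p[2]) ** 2)


# Hypothetical ratio values; label 1 = PH, label 0 = control
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]
print(auc(scores, labels))                          # 0.95
print(closest_topleft(roc_points(scores, labels)))  # (0.4, 1.0, 0.8)
```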
Language editing was aided by the artificial intelligence tool https://instatext.io/ (last accessed November 2023).
Supplementary Material
Acknowledgements
We are very grateful for the excellent technical assistance from Elisabeth Blanz, Sabine Halsegger, Daniela Kleinschek, Jessica Schweiger, Yasemin Gassner, Gert Trausinger and Edgar Gander. We express our heartfelt gratitude to Gabor Kovacs, Saskia Trescher, Pablo López-García, Sophie Narath, Michael Pienn and Peter Wolf for their valuable discussions and helpful advice.
Funding
NB, TP disclose that part of this work has been carried out with the K1 COMET Competence Center CBmed, which is funded by the Federal Ministry of Transport, Innovation and Technology; the Federal Ministry of Science, Research and Economy; Land Steiermark (Department 12, Business and Innovation); the Styrian Business Promotion Agency; and the Vienna Business Agency. The COMET program is executed by the Österreichische Forschungsförderungs GmbH FFG. VB is supported by the Austrian Science Foundation (FWF, T1032-B34).
Abbreviations
- 6MWD: six-minute walking distance
- ACh: acetylcholine
- ADMA: asymmetric dimethylarginine
- AUC: area under the curve
- BH: Benjamini–Hochberg
- BMI: body mass index
- BNP or NT-proBNP: natriuretic peptide levels
- CO: cardiac output
- COPD: chronic obstructive pulmonary disease
- CI: cardiac index
- CTEPH: chronic thromboembolic pulmonary hypertension
- DC: diseased control (non-PH)
- DLCOcVA: diffusing capacity for carbon monoxide per alveolar volume, hemoglobin corrected
- ECAR: extracellular acidification rate
- ECIS: electrical cell-substrate impedance sensor
- EDTA: ethylenediaminetetraacetic acid
- FEV1: forced expiratory volume in 1 s
- FVC: forced vital capacity
- FFA: free fatty acids
- H&E: hematoxylin-eosin
- HC: healthy control
- HILIC: hydrophilic interaction liquid chromatography
- hPAEC: human pulmonary artery endothelial cells
- hPASMC: human pulmonary artery smooth muscle cells
- HR: hazard ratio
- HRMS: high resolution mass spectrometry
- ILD: interstitial lung disease
- IPAH: idiopathic pulmonary arterial hypertension
- iPCA: independent principal component analysis
- IVD: in vitro diagnostics
- LPC: lysophosphatidylcholine
- LPE: lysophosphatidylethanolamine
- LV: left ventricle
- MAD: median absolute deviation
- mPAP: mean pulmonary arterial pressure
- MS: mass spectrometry
- OCR: oxygen consumption rate
- OPLS-DA: orthogonal projections to latent structures discriminant analysis
- PA: pulmonary arteries
- PAH: pulmonary arterial hypertension
- PAP: pulmonary arterial pressure
- PAWP: pulmonary arterial wedge pressure
- PBS: phosphate-buffered saline
- PC: phosphatidylcholine
- PDGF: platelet-derived growth factor
- PH: pulmonary hypertension
- PPP: pentose phosphate pathway
- PVR: pulmonary vascular resistance
- QC: quality control (pooled from samples)
- RAP: right atrial pressure
- RDW: red cell distribution width
- RF: random forest
- RHC: right heart catheterization
- ROC: receiver operating characteristic
- RT: room temperature (20 – 25 °C)
- SEM: standard error of mean
- SM: sphingomyelin
- SMA: smooth muscle actin
- SvO2: mixed venous oxygen saturation
- TAG: triacylglyceride
- TEER: transendothelial electrical resistance
- TLC: total lung capacity
- VWF: von-Willebrand Factor
- WHO FC: World Health Organization functional class
- WU: Wood unit
- XGBoost: eXtreme Gradient Boosting
Footnotes
Competing interests
Several authors (NB, CM, AO, BMN, HO) are inventors of the patent “Biomarker for the diagnosis of pulmonary hypertension (PH)” WO2017153472A1 (priority date 09.03.2016, granted in US, KR, JP, pending in CA, EP, AU), which is jointly held by CBmed GmbH, Joanneum Research Forschungsgesellschaft mbH, Medical University Graz and Ludwig Boltzmann Gesellschaft GmbH. The authors received no personal financial gain from the patent.
During work on this publication NB was partially employed at CBmed GmbH. TP is chief scientific officer (CSO) of CBmed GmbH. EZ and CM were employed at Joanneum Research Forschungsgesellschaft mbH. The employing companies provided support in the form of salaries, materials and reagents but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
VF received honoraria for lectures, presentations, speakers bureaus, manuscript writing, or educational events from Janssen, Chiesi, BMS, and Boehringer Ingelheim and support for attending meetings, and/or travel from Janssen, MSD, and Boehringer Ingelheim outside the submitted work.
CN received support for attending meetings, and/or travel from Boehringer Ingelheim and Inventiva pharma outside the submitted work.
BAM reports personal fees from Actelion Pharmaceuticals, Tenax and Regeneron, grants from Deerfield Company, NIH (5R01HL139613-03, R01HL163960, R01HL153502, R01HL155096-01), Boston Biomedical Innovation Center (BBIC), Brigham IGNITE award, Cardiovascular Medical research Education Foundation outside the submitted work. BAM reports patent PCT/US2019/059890 (pending), PCT/US2020/066886 (pending) and #9,605,047 (granted) not licensed and outside the submitted work.
SU received grants from the Swiss National Science Foundation, Zürich and Swiss Lung League, EMDO-Foundation, Orpha-Swiss, Janssen and MSD all unrelated to the present work. SU received consultancy fees and travel support from Orpha-Swiss, Janssen, MSD and Novartis unrelated to the present work.
TJL reports grants for his institution from Acceleron Pharma, Gossamer Bio, Janssen-Cilag, and United Therapeutics; personal fees and non-financial support from Acceleron Pharma, AstraZeneca, Boehringer Ingelheim, Gossamer Bio, Ferrer, Janssen-Cilag, MSD, Orphacare, and Pfizer outside the submitted work.
KH is a consultant at Medtronic Österreich GmbH outside the submitted work.
TP reports grants from AstraZeneca, Novo Nordisk, Sanofi paid to the Medical University of Graz outside the submitted work. TP reports personal fees and nonfinancial support from Novo Nordisk and Roche Diagnostics outside the submitted work.
HO reports grants from Bayer, Unither, Actelion, Roche, Boehringer Ingelheim, and Pfizer. HO reports personal fees and non-financial support from Medupdate and Mondial, AOP, Astra Zeneca, Bayer, Boehringer Ingelheim, Chiesi, Ferrer, Menarini, MSD, and GSK, Iqvia, Janssen, Novartis, and Pfizer outside the submitted work.
AO received honoraria for presentations and support for attending meetings, and/or travel from MSD outside the submitted work.
No conflicts of interest, financial or otherwise, are declared by the authors HL and UB.
Supplemental Information
Supplemental Information can be found attached to this publication.
Materials Availability
This study did not generate new unique reagents.
Data and Code Availability
The authors declare that all data supporting the findings in this study are available in the online supplementary data 1 and online repositories. Mass spectrometric data have been deposited in https://zenodo.org under doi: 10.5281/zenodo.7857706. Data are provided de-identified and are available immediately after publication, with no end date, for those who wish to access the data for any purpose. The Sample_Name provided in the online supplementary data 1 links to the file names in the online repository and is unsuitable for identifying single patients. The primary key is only known to part of the study team. Machine learning code is available immediately, with no end date, for those who wish to access it for any purpose on Github: https://github.com/HelgaLudwig/PHMetab.